(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(19) World Intellectual Property Organization 
International Bureau 

(43) International Publication Date 
15 March 2001 (15.03.2001) 




PCT 



(10) International Publication Number 

WO 01/18225 A1



(51) International Patent Classification 7 : C12N 15/90, 
A01K 67/027 

(21) International Application Number: PCT/US99/30078

(22) International Filing Date:

16 December 1999 (16.12.1999) 

(25) Filing Language: English 

(26) Publication Language: English 

(30) Priority Data: 

60/152,522 3 September 1999 (03.09.1999) US 

(71) Applicant: XENOGEN CORPORATION [US/US]; 860 
Atlantic Avenue, Alameda, CA 94501 (US). 

(72) Inventor: ZHANG, Ning; Xenogen Corporation, 860
Atlantic Avenue, Alameda, CA 94501 (US).



(74) Agent: SHOLTZ, Charles, K.; Xenogen Corporation, 
860 Atlantic Avenue, Alameda, CA 94501 (US).



(81) Designated States (national): AE, AL, AM, AT, AU, AZ,
BA, BB, BG, BR, BY, CA, CH, CN, CU, CZ, DE, DK, EE,
ES, FI, GB, GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP,
KE, KG, KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MD,
MG, MK, MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD,
SE, SG, SI, SK, SL, TJ, TM, TR, TT, UA, UG, UZ, VN,
YU, ZA, ZW.

(84) Designated States (regional): ARIPO patent (GH, GM,
KE, LS, MW, SD, SL, SZ, TZ, UG, ZW), Eurasian patent
(AM, AZ, BY, KG, KZ, MD, RU, TJ, TM), European patent
(AT, BE, CH, CY, DE, DK, ES, FI, FR, GB, GR, IE, IT, LU,
MC, NL, PT, SE), OAPI patent (BF, BJ, CF, CG, CI, CM,
GA, GN, GW, ML, MR, NE, SN, TD, TG).

Published: 

— With international search report. 

For two-letter codes and other abbreviations, refer to the "Guidance
Notes on Codes and Abbreviations" appearing at the beginning
of each regular issue of the PCT Gazette.



(54) Title: TARGETING CONSTRUCTS AND TRANSGENIC ANIMALS PRODUCED THEREWITH 



[Figure: schematic showing the VEGF promoter (7.8 kb) and the ATG start and stop codon positions of the vitronectin gene]

(57) Abstract: The present invention teaches targeting constructs and methods of use thereof for creating transgenic animals in
which at least one single-copy, non-essential gene is replaced with a reporter expression cassette, for example, a luciferase gene
operably linked to a promoter heterologous to the single-copy, non-essential gene. Thus, the present invention provides novel methods
and vector constructs useful for the generation of transgenic animals. The invention further includes methods of using these animals.









TARGETING CONSTRUCTS AND TRANSGENIC ANIMALS 

PRODUCED THEREWITH 



TECHNICAL FIELD

This invention is in the field of molecular biology and medicine. More 
specifically, it relates to novel vector constructs and methods of use thereof for 
introducing heterologous polynucleotides into a host cell. Further, the invention relates 
to vector constructs and methods of use thereof to generate transgenic organisms, 
particularly transgenic mice.

BACKGROUND 

In recent years, mouse geneticists have succeeded in creating transgenic animals
by manipulating the genes of developing embryos and introducing foreign genes into
these embryos. Once these genes have integrated into the genome of the recipient
embryo, the resulting embryos or adult animals can be analyzed to determine the
function of the gene.

Traditionally, transgenic mice have been generated through a DNA microinjection
approach. Such an approach leads to the creation of "founder" mice having at least one
copy of the transgene randomly integrated into the genome. Because neither the copy
number nor the integration sites can be controlled, transgenic mice generated by this
method are genetically different from each other. The expression of transgenes in such
transgenic mice is not uniform because of, for example, the difference in copy numbers
of the transgene. Furthermore, chromosomal location of the transgene often affects the
expression level of the transgene. For most in vivo studies, particularly ones where it
is desirable to compare levels of gene expression across different animals, it is important
to have mice with little or no genetic variation (i.e., isogenic mice) in order to reduce
the systematic error.

One way of controlling copy number and the chromosomal integration site is by
using targeting constructs to create transgenic animals. U.S. Patent Nos. 5,464,764 and
5,487,992 describe this type of transgenic animal in which a gene of interest is deleted or
mutated sufficiently to disrupt its function. These "knock-out" animals are made by
taking advantage of the phenomenon of homologous recombination. (See, also U.S.
Patent Nos. 5,631,153 and 5,627,059). Using the "knock-out" approach, when the vector
is introduced into the embryonic stem cell, a sequence becomes integrated into a target
gene in the genome via homologous recombination. The integration disrupts the
function of the target gene and allows for examination of the phenotype resulting from the
disruption of the gene. Neither of the above methods, however, provides a predictable
approach for the generation of gene expression reporter transgenic animals which can be
used for quantitative comparisons of gene expression across different animals transfected
with the same construct, or for quantitative comparisons of the relative levels of
expression of two or more different genes.

The present invention solves this and other problems by providing transgenic
animals in which at least one single-copy, non-essential gene is replaced with a reporter
expression cassette, for example, a luciferase gene operably linked to a heterologous
promoter. Thus, the present invention provides novel methods and vector constructs
useful for the generation of transgenic animals. The invention further includes methods
of using these animals.

SUMMARY OF THE INVENTION 
The transgenic animals described herein are useful, for example, when studying
in vivo regulation of selected genes. Also described herein are methods of generating
populations of substantially isogenic transgenic animals, as well as vectors useful in
these methods.

Accordingly, in one embodiment, the subject invention is directed to a transgenic,
non-human mammal, for example, a rodent such as a mouse. The mammal comprises at
least one single-copy, non-essential gene in its genome, wherein (i) at least a portion of
at least one single-copy, non-essential gene is replaced by polynucleotide sequences
heterologous to the gene, and (ii) the polynucleotide sequences comprise a first
expression cassette which has been introduced into the mammal or an ancestor of the
mammal, at an embryonic stage. The first expression cassette typically comprises a first
selectable marker, a first transcriptional promoter element heterologous to the gene, and
light-generating protein coding sequences. The light-generating protein coding
sequences are operably linked to the promoter element.

The single-copy, non-essential gene may be selected, for example, from the
group consisting of vitronectin, fosB, and galactin 3, and the first selectable marker may
be selected from the group consisting of neomycin phosphotransferase II,
xanthine-guanine phosphoribosyltransferase, hygromycin-B-phosphotransferase, chloramphenicol
acetyltransferase, and adeninephosphoribosyl transferase.

In alternative embodiments, the first transcriptional promoter element is an
inducible promoter, a repressible promoter, or a constitutive promoter, and may be
selected from the group consisting of VEGF, VEGFR, and TIE2.

In additional embodiments, the transgenic, non-human mammal described above
comprises a second single-copy, non-essential gene in its genome, wherein (i) at least a
portion of the second single-copy, non-essential gene is replaced by polynucleotide
sequences heterologous to the second gene, and (ii) the polynucleotide sequences
comprise a second expression cassette which has been introduced into the mammal or an
ancestor of the mammal, at an embryonic stage. The second expression cassette
typically comprises a second selectable marker, a second transcriptional promoter
element heterologous to the second gene, and light generating protein coding sequences.
The light generating protein coding sequences are operably linked to the promoter
element.

The first and second transcriptional promoter elements and selectable markers
may be the same or different and the light generating protein in the first expression
cassette can produce a different color of light relative to the light generating protein in
the second expression cassette.

In yet a further embodiment, the invention is directed to a method of producing a
transgenic, non-human mammal, such as a mouse. The mammal has at least one
single-copy, non-essential gene in its genome. The method comprises

transfecting an embryonic stem cell of the mammal with a linear vector
comprising

(a) a first selectable marker and a reporter expression cassette, the reporter
expression cassette comprising a transcriptional promoter element operably linked to a
light generating protein coding sequence, and

(b) targeting polynucleotide sequences homologous to a single-copy,
non-essential gene in said mammal's genome, the targeting polynucleotide sequences
flanking (a), wherein (i) the length of the polynucleotide sequences is sufficient to
facilitate homologous recombination between the vector and the single-copy,
non-essential gene, and (ii) the transcriptional promoter element is heterologous to the
single-copy, non-essential gene;

selecting embryonic stem cells which each have the first selectable marker and
reporter expression cassette integrated into their genomes;

injecting the embryonic stem cells into a host embryo;

implanting the embryo in a foster mother; and

maintaining the foster mother under conditions which allow production of an
offspring that is a transgenic, non-human mammal carrying the reporter expression
cassette.

In certain embodiments, the offspring is capable of germline transmission of the
reporter expression cassette and the method may further comprise breeding the offspring
with a mammal which is substantially isogenic with the embryonic stem cells, such that
the breeding yields transgenic F1 offspring carrying the reporter cassette. In particular
embodiments, the method comprises breeding the first F1 offspring carrying the reporter
cassette with a second F1 offspring carrying the reporter cassette, wherein the breeding
yields transgenic F2 offspring carrying the reporter cassette.

In additional embodiments, the embryonic stem cells may be derived from a
mouse having a dark coat color, the mammal substantially isogenic with the embryonic
stem cells may have a light coat color, and/or the F2 offspring carrying the reporter
cassette may have a light coat color. In particular embodiments, the embryonic stem
cells are derived from a C57BL/6 mouse having a dark coat color, and the mammal
substantially isogenic with the embryonic stem cells is a C57BL/6-Tyr C2j/+ mouse
having a light coat color.

In still a further embodiment, the subject invention is directed to a vector for use
in generating a transgenic non-human mammal, for example, a rodent such as a mouse.
The mammal has at least one single-copy, non-essential gene in its genome. The vector
comprises

(a) a first selectable marker and a reporter expression cassette, the reporter
expression cassette comprising a transcriptional promoter element operably linked to a
light generating protein coding sequence, and

(b) targeting polynucleotide sequences homologous to a single-copy,
non-essential gene in the mammal's genome, the targeting polynucleotide sequences flanking
(a), wherein (i) the length of the targeting polynucleotide sequences is sufficient to
facilitate homologous recombination between the vector and the single-copy,
non-essential gene, and (ii) the transcriptional promoter element is heterologous to the
single-copy, non-essential gene.

In certain embodiments, the first selectable marker provides a positive selection
and may be selected from the group consisting of neomycin phosphotransferase II,
xanthine-guanine phosphoribosyltransferase, hygromycin-B-phosphotransferase,
chloramphenicol acetyltransferase, and adeninephosphoribosyl transferase. Additionally,
the transcriptional promoter element may be an inducible promoter, a repressible
promoter, or a constitutive promoter, and may be selected from the group consisting of
VEGF, VEGFR, and TIE2.

In alternative embodiments, the vector further comprises a second selectable
marker and at least one target polynucleotide sequence is located between the second
selectable marker and the first selectable marker. Additionally, the second selectable
marker may provide a negative selection and may be selected from the group consisting
of adenosine deaminase, thymidine kinase, and dihydrofolate reductase.

The vectors described above may be circular and may contain at least one
restriction site whose cleavage results in a linear vector having the following
arrangement of elements: target polynucleotide sequence - (a) - targeting polynucleotide
sequences or target polynucleotide sequence - (a) - targeting polynucleotide sequences -
(second selectable marker).

The coding sequences of the reporter expression cassette present in the vector
may comprise codons that are optimal for expression in a host system into which the
expression cassette is to be introduced. Additionally, the targeting polynucleotide
sequences from single-copy, non-essential genes may be selected from the group
consisting of vitronectin, fosB, and galactin 3.

The light-generating protein in the mammals and methods described above may
be derived from either procaryotic or eucaryotic sources and, in particularly preferred
embodiments, the light generating protein is a luciferase.

These and other embodiments of the present invention will be apparent to those
of skill in the art in view of the teachings herein.

BRIEF DESCRIPTION OF THE DRAWINGS



Figure 1 is a schematic depicting construction of the pTK53 vector.
Polynucleotides encoding PGK-P, Neo and TK and 5' and 3' linkers are introduced into
a pKS backbone to produce the vector designated pTK53.

Figure 2 is a schematic depicting construction of the pTK-LucR and pTK-LucYG
vectors. For pTK-LucR, a polynucleotide encoding LucR is introduced into pTK53.
Thus, the pTK-LucR construct contains the PGK-P gene, a neomycin (Neo^r) gene, a
thymidine kinase (TK) gene and a sequence encoding red luciferase (Luc-R). For
pTK-LucYG, a polynucleotide encoding LucYG is introduced into pTK53. Thus, the
pTK-LucYG construct contains the PGK-P gene, a neomycin (Neo^r) gene, a thymidine kinase
(TK) gene and a sequence encoding yellow-green luciferase (Luc-YG).

Figure 3A is a schematic depicting the vector pTKLR-Vn. Sequences
homologous to the vitronectin gene are inserted into pTK-LucR such that they flank the
Neo^r gene and the Luc-R coding sequence. Figure 3B is a schematic depicting targeting
of the linearized pTKLR-Vn vector to the vitronectin chromosomal locus. The VEGF
promoter is cloned into the polylinkers between Neo and Luc-R. Upon homologous
recombination, the Neo-VEGF-LucR transgene is inserted into the Vn gene. In the
figure, (A) shows the targeting vector pTKLR-Vn and (B) shows the mouse vitronectin
gene. In the figure, Neo - neomycin resistance encoding sequences; TK - thymidine
kinase encoding sequences; LucR - red luciferase from pGL3Red (Dr. Christopher
Contag, Stanford University, Stanford, CA). Regions bearing Vn gene translational start
and stop codons are indicated with arrows. Poly(A) sequences are placed upstream of the
polylinker to prevent or minimize read-through translation. Figure 3C shows the
nucleotide sequence of vitronectin (SEQ ID NO:38).

Figure 4A is a schematic depicting the vector pTKLG-Fos. Sequences
homologous to the FosB gene are inserted into pTK-LucYG such that they flank the Neo^r
gene and the Luc-YG coding sequence. Figure 4B shows the nucleotide sequence of
FosB (SEQ ID NO:39).

Figure 5A is a schematic depicting targeting of the linearized pTKLG-Fos vector
to the FosB chromosomal locus. The VEGFR2 promoter is cloned into the polylinkers
between Neo and Luc-YG. Upon homologous recombination, the Neo-VEGFR2-LucYG
transgene will be inserted into a sequence associated with production of FosB. In the
figure, (A) shows the targeting vector, and (B) shows the mouse target gene. In the
figure, Neo - neomycin resistance encoding sequences; TK - thymidine kinase encoding
sequences; LucYG - yellow green luciferase from pGL3-control vector (Promega,
Madison, WI). Regions bearing FosB gene translational start and stop codons are
indicated with arrows. Poly(A) sequences are placed upstream of the polylinker to
prevent or minimize read-through translation. Figure 5B is a schematic depicting
targeting of the linearized pTKLG-Fos vector to the FosB chromosomal locus. The
TIE2 promoter is cloned into the polylinkers between Neo and Luc-R. Upon
homologous recombination, the Neo-Tie2-LucYG transgene is inserted into the FosB
gene. In the figure, (A) shows the targeting vector, and (B) shows the mouse target gene.
In the figure, Neo - neomycin resistance encoding sequences; TK - thymidine kinase
encoding sequences; LucYG - yellow green luciferase from pGL3-control vector
(Promega). Regions bearing FosB gene translational start and stop codons are indicated
with arrows. Poly(A) sequences are placed upstream of the polylinker to prevent or
minimize read-through translation.

Figure 6 depicts PCR conditions for genomic screening for promoters useful in 
exemplary targeting constructs of the present invention. 

Figure 7 depicts generation of targeted transgenic mice using the targeting
vectors described herein.

Figure 8 depicts a schematic representation of Southern blot analysis of
homologous DNA recombination between the pTKLG-Fos targeting vector and the FosB
gene.

Figure 9 depicts generation of targeted transgenic mice, using the targeting
vectors described herein, and crosses using such transgenics as well as their offspring
(F1, first generation; F2, second generation).

Figure 10 depicts crosses using transgenic mice of the present invention to 
generate dual luciferase transgenic mice. 



MODES FOR CARRYING OUT THE INVENTION
Throughout this application, various publications, patents, and published patent
applications are referred to by an identifying citation. The disclosures of these
publications, patents, and published patent specifications referenced in this application
are included to more fully describe the state of the art to which this invention pertains.

The practice of the present invention will employ, unless otherwise indicated,
conventional techniques of molecular biology, microbiology, cell biology and
recombinant DNA, which are within the skill of the art. See, e.g., Sambrook, Fritsch,
and Maniatis, MOLECULAR CLONING: A LABORATORY MANUAL, 2nd edition
(1989); CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, (F.M. Ausubel et al.
eds., 1987); the series METHODS IN ENZYMOLOGY (Academic Press, Inc.); PCR 2:
A PRACTICAL APPROACH (M.J. McPherson, B.D. Hames and G.R. Taylor eds.,
1995) and ANIMAL CELL CULTURE (R.I. Freshney, ed., 1987).

As used in this specification and the appended claims, the singular forms "a,"
"an" and "the" include plural references unless the content clearly dictates otherwise.
Thus, for example, reference to "an antigen" includes a mixture of two or more such
agents.

Definitions 

As used herein, certain terms will have specific meanings.

The terms "nucleic acid molecule" and "polynucleotide" are used interchangeably
and refer to a polymeric form of nucleotides of any length, either
deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have
any three-dimensional structure, and may perform any function, known or unknown.
Non-limiting examples of polynucleotides include a gene, a gene fragment, exons,
introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA,
recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA
of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers.

A polynucleotide is typically composed of a specific sequence of four nucleotide
bases: adenine (A); cytosine (C); guanine (G); and thymine (T) (uracil (U) for thymine
(T) when the polynucleotide is RNA). Thus, the term polynucleotide sequence is the
alphabetical representation of a polynucleotide molecule. This alphabetical
representation can be input into databases in a computer having a central processing unit
and used for bioinformatics applications such as functional genomics and homology
searching.
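
As an illustration of the alphabetical representation described above, the following minimal Python sketch stores a DNA sequence as a string and writes the same sequence in the RNA alphabet by substituting U for T. The function name `dna_to_rna` is an assumption made for this example only and does not appear in the specification.

```python
# Minimal sketch: a polynucleotide as a string over the four-letter alphabet.
# The helper name dna_to_rna is hypothetical and used only for illustration.

DNA_BASES = set("ACGT")


def dna_to_rna(sequence: str) -> str:
    """Return the RNA-alphabet form of a DNA sequence (U substituted for T)."""
    sequence = sequence.upper()
    unknown = set(sequence) - DNA_BASES
    if unknown:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(unknown)}")
    return sequence.replace("T", "U")
```

For example, `dna_to_rna("ATGC")` yields the string `AUGC`, which could then be stored or searched like any other text record.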

A "coding sequence" or a sequence which "encodes" a selected polypeptide, is a
nucleic acid molecule which is transcribed (in the case of DNA) and translated (in the
case of mRNA) into a polypeptide, for example, in vivo when placed under the control of
appropriate regulatory sequences (or "control elements"). The boundaries of the coding
sequence are typically determined by a start codon at the 5' (amino) terminus and a
translation stop codon at the 3' (carboxy) terminus. A coding sequence can include, but
is not limited to, cDNA from viral, procaryotic or eucaryotic mRNA, genomic DNA
sequences from viral or procaryotic DNA, and even synthetic DNA sequences. A
transcription termination sequence may be located 3' to the coding sequence. Other
"control elements" may also be associated with a coding sequence. A DNA sequence
encoding a polypeptide can be optimized for expression in a selected cell by using the
codons preferred by the selected cell to represent the DNA copy of the desired
polypeptide coding sequence. "Encoded by" refers to a nucleic acid sequence which
codes for a polypeptide sequence, wherein the polypeptide sequence or a portion thereof
contains an amino acid sequence of at least 3 to 5 amino acids, more preferably at least 8
to 10 amino acids, and even more preferably at least 15 to 20 amino acids from a
polypeptide encoded by the nucleic acid sequence. Also encompassed are polypeptide
sequences which are immunologically identifiable with a polypeptide encoded by the
sequence.
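
The codon optimization mentioned above can be sketched as a lookup from each amino acid to one codon assumed to be preferred in the selected cell. In the Python sketch below, the table `PREFERRED_CODON` and the function `back_translate` are hypothetical names introduced for illustration. Each listed codon does encode the indicated residue under the standard genetic code, but the choice of which synonymous codon is "preferred" is a placeholder, not measured codon usage for any host.

```python
# Minimal sketch of codon selection for a chosen host cell.
# PREFERRED_CODON is a hypothetical, deliberately tiny table: the preferences
# are placeholders for illustration, not real codon-usage data.

PREFERRED_CODON = {
    "M": "ATG",  # methionine (start)
    "K": "AAG",  # lysine
    "L": "CTG",  # leucine
    "F": "TTC",  # phenylalanine
    "*": "TGA",  # translation stop
}


def back_translate(protein: str, table: dict = PREFERRED_CODON) -> str:
    """Write a protein sequence as DNA using one assumed preferred codon per residue."""
    codons = []
    for residue in protein.upper():
        if residue not in table:
            raise ValueError(f"no preferred codon listed for residue {residue!r}")
        codons.append(table[residue])
    return "".join(codons)
```

A real application would replace the toy table with a complete codon-usage table for the intended host and would likely weigh additional constraints, such as avoiding unwanted restriction sites.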

Typical "control elements" include, but are not limited to, transcription
promoters, transcription enhancer elements, transcription termination signals,
polyadenylation sequences (located 3' to the translation stop codon), sequences for
optimization of initiation of translation (located 5' to the coding sequence), translation
enhancing sequences, and translation termination sequences. Transcription promoters
can include inducible promoters (where expression of a polynucleotide sequence
operably linked to the promoter is induced by an analyte, cofactor, regulatory protein,
etc.), repressible promoters (where expression of a polynucleotide sequence operably
linked to the promoter is repressed by an analyte, cofactor, regulatory protein, etc.), and
constitutive promoters.

"Expression enhancing sequences" typically refer to control elements that
improve transcription or translation of a polynucleotide relative to the expression level in
the absence of such control elements (for example, promoters, promoter enhancers,
enhancer elements, and translational enhancers (e.g., Shine-Dalgarno sequences)).

"Purified polynucleotide" refers to a polynucleotide of interest or fragment
thereof which is essentially free, e.g., contains less than about 50%, preferably less than
about 70%, and more preferably less than about 90%, of the protein with which the
polynucleotide is naturally associated. Techniques for purifying polynucleotides of
interest are well-known in the art and include, for example, disruption of the cell
containing the polynucleotide with a chaotropic agent and separation of the
polynucleotide(s) and proteins by ion-exchange chromatography, affinity
chromatography and sedimentation according to density.

A "heterologous sequence" as used herein typically refers to either (i) a nucleic
acid sequence that is not normally found in the cell or organism of interest, or (ii) a
nucleic acid sequence introduced at a genomic site wherein the nucleic acid sequence
does not normally occur in nature at that site. For example, a DNA sequence encoding a
polypeptide can be obtained from yeast and introduced into a bacterial cell. In this case
the yeast DNA sequence is "heterologous" to the native DNA of the bacterial cell.
Alternatively, a promoter sequence from a Tie2 gene can be introduced into the genomic
location of the fosB gene. In this case the Tie2 promoter sequence is "heterologous" to the
native fosB genomic sequence.

A "polypeptide" is used in its broadest sense to refer to a compound of two or
more subunit amino acids, amino acid analogs, or other peptidomimetics. The subunits
may be linked by peptide bonds or by other bonds, for example ester, ether, etc. As used
herein, the term "amino acid" refers to either natural and/or unnatural or synthetic amino
acids, including glycine and both the D or L optical isomers, and amino acid analogs and
peptidomimetics. A peptide of three or more amino acids is commonly called an
oligopeptide if the peptide chain is short. If the peptide chain is long, the peptide is
typically called a polypeptide or a protein.

"Operably linked" refers to an arrangement of elements wherein the components
so described are configured so as to perform their usual function. Thus, a given
promoter that is operably linked to a coding sequence (e.g., a reporter expression
cassette) is capable of effecting the expression of the coding sequence when the proper
enzymes are present. The promoter or other control elements need not be contiguous
with the coding sequence, so long as they function to direct the expression thereof. For
example, intervening untranslated yet transcribed sequences can be present between the
promoter sequence and the coding sequence and the promoter sequence can still be
considered "operably linked" to the coding sequence.

"Recombinant" as used herein to describe a nucleic acid molecule means a
polynucleotide of genomic, cDNA, semisynthetic, or synthetic origin which, by virtue of
its origin or manipulation: (1) is not associated with all or a portion of the polynucleotide
with which it is associated in nature; and/or (2) is linked to a polynucleotide other than
that to which it is linked in nature. The term "recombinant" as used with respect to a
protein or polypeptide means a polypeptide produced by expression of a recombinant
polynucleotide. "Recombinant host cells," "host cells," "cells," "cell lines," "cell
cultures," and other such terms denoting procaryotic microorganisms or eucaryotic cell
lines cultured as unicellular entities, are used interchangeably, and refer to cells which
can be, or have been, used as recipients for recombinant vectors or other transfer DNA,
and include the progeny of the original cell which has been transfected. It is understood
that the progeny of a single parental cell may not necessarily be completely identical in
morphology or in genomic or total DNA complement to the original parent, due to
accidental or deliberate mutation. Progeny of the parental cell which are sufficiently
similar to the parent to be characterized by the relevant property, such as the presence of
a nucleotide sequence encoding a desired peptide, are included in the progeny intended
by this definition, and are covered by the above terms. An "isolated polynucleotide"
molecule is a nucleic acid molecule separate and discrete from the whole organism with
which the molecule is found in nature; or a nucleic acid molecule devoid, in whole or
part, of sequences normally associated with it in nature; or a sequence, as it exists in
nature, but having heterologous sequences (as defined above) in association therewith.

Techniques for determining nucleic acid and amino acid "sequence identity" also 
are known in the art. Typically, such techniques include determining the nucleotide 

30 sequence of the mRNA for a gene and/or determining the amino acid sequence encoded 
thereby, and comparing these sequences to a second nucleotide or amino acid sequence. 
In general, "identity" refers to an exact nucleotide-to-nucleotide or amino acid-to-amino 
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acid correspondence of two polynucleotides or polypeptide sequences, respectively. 
Two or more sequences (polynucleotide or amino acid) can be compared by determining 
their "percent identity." The percent identity of two sequences, whether nucleic acid or 
amino acid sequences, is the number of exact matches between two aligned sequences 
5 divided by the length of the shorter sequences and multiplied by 100. An approximate 
alignment for nucleic acid sequences is provided by the local homology algorithm of 
Smith and Waterman, Advances in Applied Mathematics 2:482-489 (1981). This 
algorithm can be applied to amino acid sequences by using the scoring matrix developed 
by Dayhoff . Atlas of Protein Sequences and Structure . M.O. Dayhoff ed., 5 suppl. 3:353- 

10 358, National Biomedical Research Foundation, Washington, D.C., USA, and 

normalized by Gribskov, Nucl. Acids Res. 14(6):6745-6763 (1986). An exemplary 
implementation of this algorithm to determine percent identity of a sequence is provided 
by the Genetics Computer Group (Madison, WI) in the "BestRt" utility application. The 
default parameters for this method are described in the Wisconsin Sequence Analysis 

15 Package Program Manual, Version 8 (1995) (available from Genetics Computer Group, 
Madison, WI). A preferred method of establishing percent identity in the context of the 
present invention is to use the MPSRCH package of programs copyrighted by the 
University of Edinburgh, developed by John F. Collins and Shane S. Sturrok, and 
distributed by IntelliGenetics, Inc. (Mountain View, CA). From this suite of packages 

20 the Smith-Waterman algorithm can be employed where default parameters are used for 
the scoring table (for example, gap open penalty of 12, gap extension penalty of one, and 
a gap of six). From the data generated the "Match" value reflects "sequence identity." 
Other suitable programs for calculating the percent identity or similarity between 
sequences are generally known in the art, for example, another alignment program is
BLAST, used with default parameters. For example, BLASTN and BLASTP can be
used with the following default parameters: genetic code = standard; filter = none;
strand = both; cutoff = 60; expect = 10; Matrix = BLOSUM62; Descriptions = 50
sequences; sort by = HIGH SCORE; Databases = non-redundant, GenBank + EMBL +
DDBJ + PDB + GenBank CDS translations + Swiss protein + Spupdate + PIR. Details
of these programs can be found at the following internet address:
http://www.ncbi.nlm.gov/cgi-bin/BLAST. When claiming sequences relative to
sequences of the present invention, the desired degrees of sequence identity are at least
80%, 85-90%, preferably 92%, more preferably 95%, and even more preferably 98%
sequence identity to the reference sequence (i.e., the sequences of the present invention).
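
As a rough illustration of the percent identity calculation described above (exact matches between two aligned sequences, divided by the length of the shorter sequence and multiplied by 100), a minimal Python sketch follows. It assumes the two sequences have already been aligned (for example, by a Smith-Waterman implementation) and that gaps are written as "-". The names percent_identity, highest_threshold_met and IDENTITY_THRESHOLDS are hypothetical and are not part of BestFit, MPSRCH or BLAST.

```python
# Identity levels named in the text, in percent (assumption: the "85-90%"
# range is represented here by its two endpoints).
IDENTITY_THRESHOLDS = (80.0, 85.0, 90.0, 92.0, 95.0, 98.0)


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal aligned length.

    Exact matches are counted column by column, divided by the length of the
    shorter ungapped sequence, and multiplied by 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    shorter = min(
        len(aligned_a.replace(gap, "")),
        len(aligned_b.replace(gap, "")),
    )
    if shorter == 0:
        raise ValueError("sequences must not be empty")
    return 100.0 * matches / shorter


def highest_threshold_met(pct: float):
    """Return the highest listed identity level reached, or None if below 80%."""
    met = [t for t in IDENTITY_THRESHOLDS if pct >= t]
    return max(met) if met else None
```

For example, percent_identity("ACGTACGT", "ACGTACGA") counts seven matches over a shorter length of eight, and the result can be passed to highest_threshold_met to see which of the listed identity levels it reaches.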

Alternatively, the degree of sequence similarity between polynucleotides can be 
determined by hybridization of polynucleotides under conditions that form stable
duplexes between homologous regions, followed by digestion with single-stranded-
specific nuclease(s), and size determination of the digested fragments. Two DNA, or
two polypeptide, sequences are "substantially homologous" to each other when the
sequences exhibit at least about 80%-85%, preferably at least about 85%-90%, more 
preferably at least about 90%-95%, and most preferably at least about 95%-98%
sequence identity over a defined length of the molecules, as determined using the
methods above. As used herein, substantially homologous also refers to sequences 
showing complete identity to the specified DNA or polypeptide sequence. DNA 
sequences that are substantially homologous can be identified in a Southern 
hybridization experiment under, for example, stringent conditions, as defined for that
particular system. Defining appropriate hybridization conditions is within the skill of the
art. See, e.g., Sambrook et al., supra; DNA Cloning, supra; Nucleic Acid Hybridization,
supra. 

Two nucleic acid fragments are considered to "selectively hybridize" as described 
herein. The degree of sequence identity between two nucleic acid molecules affects the
efficiency and strength of hybridization events between such molecules. A partially
identical nucleic acid sequence will at least partially inhibit a completely identical 
sequence from hybridizing to a target molecule. Inhibition of hybridization of the 
completely identical sequence can be assessed using hybridization assays that are well 
known in the art (e.g., Southern blot, Northern blot, solution hybridization, or the like,
see Sambrook, et al., Molecular Cloning: A Laboratory Manual, Second Edition, (1989)
Cold Spring Harbor, N.Y.). Such assays can be conducted using varying degrees of 
selectivity, for example, using conditions varying from low to high stringency. If 
conditions of low stringency are employed, the absence of non-specific binding can be 
assessed using a secondary probe that lacks even a partial degree of sequence identity
(for example, a probe having less than about 30% sequence identity with the target
molecule), such that, in the absence of non-specific binding events, the secondary probe
will not hybridize to the target.

When utilizing a hybridization-based detection system, a nucleic acid probe is
chosen that is complementary to a target nucleic acid sequence, and then by selection of 
appropriate conditions the probe and the target sequence "selectively hybridize," or bind, 
to each other to form a hybrid molecule. A nucleic acid molecule that is capable of
hybridizing selectively to a target sequence under "moderately stringent" conditions typically
hybridizes under conditions that allow detection of a target nucleic acid sequence of at 
least about 10-14 nucleotides in length having at least approximately 70% sequence 
identity with the sequence of the selected nucleic acid probe. Stringent hybridization 
conditions typically allow detection of target nucleic acid sequences of at least about 10-14
nucleotides in length having a sequence identity of greater than about 90-95% with
the sequence of the selected nucleic acid probe. Hybridization conditions useful for
probe/target hybridization where the probe and target have a specific degree of sequence
identity, can be determined as is known in the art (see, for example, Nucleic Acid
Hybridization: A Practical Approach, editors B.D. Hames and S.J. Higgins, (1985)
Oxford; Washington, DC; IRL Press).

With respect to stringency conditions for hybridization, it is well known in the art 
that numerous equivalent conditions can be employed to establish a particular stringency 
by varying, for example, the following factors: the length and nature of probe and target 
sequences, base composition of the various sequences, concentrations of salts and other
hybridization solution components, the presence or absence of blocking agents in the
hybridization solutions (e.g., formamide, dextran sulfate, and polyethylene glycol),
hybridization reaction temperature and time parameters, as well as varying wash
conditions. A particular set of hybridization conditions is selected
following standard methods in the art (see, for example, Sambrook, et al., Molecular
Cloning: A Laboratory Manual, Second Edition, (1989) Cold Spring Harbor, N.Y.).

A "vector" is capable of transferring gene sequences to target cells. Typically, 
"vector construct," "expression vector," and "gene transfer vector," mean any nucleic 
acid construct capable of directing the expression of a gene of interest and which can 
transfer gene sequences to target cells. Thus, the term includes cloning and expression
vehicles, as well as integrating vectors.

"Nucleic acid expression vector" or "expression cassette" refers to an assembly 
which is capable of directing the expression of a sequence or gene of interest. The 
nucleic acid expression vector includes a promoter which is operably linked to the
sequences or gene(s) of interest. Other control elements may be present as well.
Expression cassettes described herein may be contained within a plasmid construct. In
addition to the components of the expression cassette, the plasmid construct may also
include a bacterial origin of replication, one or more selectable markers, a signal which
allows the plasmid construct to exist as single-stranded DNA (e.g., an M13 origin of
replication), a multiple cloning site, and a "mammalian" origin of replication (e.g., an
SV40 or adenovirus origin of replication).

An "expression cassette" comprises any nucleic acid construct capable of 
directing the expression of a gene/coding sequence of interest. Such cassettes can be
constructed into a "vector," "vector construct," "expression vector," or "gene transfer 
vector," in order to transfer the expression cassette into target cells. Thus, the term 
includes cloning and expression vehicles, as well as viral vectors. 

"Luciferase," unless stated otherwise, includes prokaryotic and eukaryotic 
luciferases, as well as variants possessing varied or altered optical properties, such as
luciferases that produce different colors of light (e.g., Kajiyama, N., and Nakano, E., 
Protein Engineering 4(6):691-693 (1991)). 

"Light-generating" is defined as capable of generating light through a chemical 
reaction or through the absorption of radiation.

A "light generating protein" or "light-emitting protein" is a protein capable of
generating light in the visible spectrum (between approximately 350 nm and 800 nm).
Examples include bioluminescent proteins such as luciferases, e.g., bacterial and firefly
luciferases, as well as fluorescent proteins such as green fluorescent protein (GFP).

"Light" is defined herein, unless stated otherwise, as electromagnetic radiation 
having a wavelength of between about 300 nm and about 1100 nm.

"Animal" as used herein typically refers to a non-human mammal, including, 
without limitation, farm animals such as cattle, sheep, pigs, goats and horses; domestic 
mammals such as dogs and cats; laboratory animals including rodents such as mice, rats 
and guinea pigs; birds, including domestic, wild and game birds such as chickens, 
turkeys and other gallinaceous birds, ducks, geese, and the like. The term does not
denote a particular age. Thus, both adult and newborn individuals are intended to be 
covered. 

A "transgenic animal" refers to a genetically engineered animal or offspring of
genetically engineered animals. A transgenic animal usually contains material from at
least one unrelated organism, such as from a virus, plant, or other animal. The "non-
human animals" of the invention include vertebrates such as rodents, non-human
primates, sheep, dogs, cows, amphibians, birds, fish, insects, reptiles, etc. The term
"chimeric animal" is used to refer to animals in which the heterologous gene is found, or 
in which the heterologous gene is expressed in some but not all cells of the animal. 

"Analyte" as used herein refers to any compound or substance whose effects 
(e.g., induction or repression of a specific promoter) can be evaluated using the test
animals and methods of the present invention. Such analytes include, but are not limited
to, chemical compounds, pharmaceutical compounds, polypeptides, peptides, 
polynucleotides, and polynucleotide analogs. Many organizations (e.g., the National 
Institutes of Health, pharmaceutical and chemical corporations) have large libraries of 
chemical or biological compounds from natural or synthetic processes, or fermentation
broths or extracts. Such compounds/analytes can be employed in the practice of the
present invention. 

As used herein, the term "positive selection marker" refers to a gene encoding a 
product that enables only the cells that carry the gene to survive and/or grow under 
certain conditions. For example, plant and animal cells that express the introduced
neomycin resistance (Neo r) gene are resistant to the compound G418. Cells that do not
carry the Neo r gene marker are killed by G418. Other positive selection markers will be
known to those of skill in the art. Typically, positive selection markers encode products
that can be readily assayed. Thus, positive selection markers can be used to determine
whether a particular DNA construct has been introduced into a cell, organ or tissue. 

"Negative selection marker" refers to a gene encoding a product which can be used
to selectively kill and/or inhibit growth of cells under certain conditions. Non-limiting
examples of negative selection inserts include a herpes simplex virus (HSV)-thymidine
kinase (TK) gene. Cells containing an active HSV-TK gene are incapable of growing in
the presence of ganciclovir or similar agents. Thus, depending on the substrate, some
gene products can act as either positive or negative selection markers.

The term "homologous recombination" refers to the exchange of DNA fragments 
between two DNA molecules or chromatids at the site of essentially identical nucleotide 
sequences. It is understood that substantially homologous sequences can accommodate
insertions, deletions, and substitutions in the nucleotide sequence. Thus, linear 
sequences of nucleotides can be essentially identical even if some of the nucleotide 
residues do not precisely correspond or align (see, above). 

A "knock-out" mutation refers to partial or complete loss of expression of at least
a portion of the target gene. Examples of knock-out mutations include, but are not limited
to, gene-replacement by heterologous sequences, gene disruption by heterologous
sequences, and deletion of essential elements of the gene (e.g., promoter region, portions
of a coding sequence). A "knock-out" mutation is typically identified by the phenotype
generated by the mutation.

A "single-copy gene" as used herein refers to a gene represented in an organism's 
genome only by a single copy at a particular chromosomal locus. Accordingly, a diploid 
organism has two copies of the gene and both copies occur at the same chromosomal 
location. 

A "gene" as used in the context of the present invention is a sequence of
nucleotides in a genetic nucleic acid (chromosome, plasmid, etc.) with which a genetic
function is associated. A gene is a hereditary unit, for example of an organism, 
comprising a polynucleotide sequence (e.g., a DNA sequence for mammals) that 
occupies a specific physical location (a "gene locus" or "genetic locus") within the
genome of an organism. A gene can encode an expressed product, such as a polypeptide
or a polynucleotide (e.g., tRNA). Alternatively, a gene may define a genomic location 
for a particular event/function, such as the binding of proteins and/or nucleic acids (e.g., 
phage attachment sites), wherein the gene does not encode an expressed product. 
Typically, a gene includes coding sequences, such as polypeptide encoding sequences,
and non-coding sequences, such as promoter sequences, poly-adenylation sequences,
and transcriptional regulatory sequences (e.g., enhancer sequences). Many eucaryotic genes
have "exons" (coding sequences) interrupted by "introns" (non-coding sequences). In 
certain cases, a gene may share sequences with another gene(s) (e.g., overlapping genes).

"Isogenic" means two or more organisms or cells that are considered to be
genetically identical. "Substantially isogenic" means two or more organisms or cells
wherein, at the majority of genetic loci (e.g., greater than 99.000%, preferably more than
99.900%, more preferably greater than 99.990%, even more preferably greater than
99.999%), there exists genetic identity between the organisms or cells being compared.
In the context of the present invention, two organisms (for example, mice) are considered 
to be "substantially isogenic" if, for example, inserted transgenes are the primary 
differences between the genetic make-up of the mice being compared. Further, if, for 
example, the genetic backgrounds of the mice being compared are the same with the
exception that one of the mice has one or several defined mutation(s) (for example, 
affecting coat color), then these mice are considered to be substantially isogenic. An 
example of two strains of substantially isogenic mice are C57BL/6 and C57BL/6-Tyr 
C2j/+. 
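
The "substantially isogenic" criterion above can be read as a simple comparison of genotypes across loci. The following Python sketch is only an illustration under stated assumptions: each organism is represented by a hypothetical dictionary mapping a locus name to an allele label, and only loci scored in both organisms are compared. The function names percent_identical_loci and is_substantially_isogenic are invented for this example.

```python
def percent_identical_loci(genotype_a: dict, genotype_b: dict) -> float:
    """Percentage of shared loci at which the two genotypes carry the same allele."""
    shared = genotype_a.keys() & genotype_b.keys()
    if not shared:
        raise ValueError("no loci in common to compare")
    same = sum(1 for locus in shared if genotype_a[locus] == genotype_b[locus])
    return 100.0 * same / len(shared)


def is_substantially_isogenic(genotype_a: dict, genotype_b: dict,
                              min_percent: float = 99.0) -> bool:
    """True if genetic identity exists at more than min_percent of compared loci.

    The default of 99.0 corresponds to the least stringent level in the text;
    99.9, 99.99 or 99.999 can be passed for the preferred levels.
    """
    return percent_identical_loci(genotype_a, genotype_b) > min_percent
```

For instance, two mice that differ only at a single coat-color locus out of 1000 scored loci would pass at the 99.0% level but not at the 99.99% level.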

A "pseudogene" as used herein, refers to a type of gene sequence found in the
genomes, typically, of eucaryotes, where the sequence closely resembles a known
functional gene, but differs in that the pseudogene is non-functional. For example, the 
pseudogene sequence may contain several stop codons in what would correspond to an 
open reading frame in the functional gene. Pseudogenes can also have deletions or
insertions relative to their corresponding functional gene. If, for example, in a genome
there is a functional gene and a related pseudogene, the functional gene is considered to 
be a single-copy gene (accordingly, the pseudogene is considered to be single-copy as 
well). 

A "non-essential gene" refers to a gene whose deletion, disruption, elimination,
reduction of gene function, or mutation is non-lethal, and does not obviously adversely
affect the organism's ability to mature and reproduce. A "non-essential gene with no
phenotype" refers to a non-essential gene whose deletion, disruption, elimination,
reduction of gene function or mutation has no deleterious effect on the organism.
Typically there are no phenotypically reflected gene dosage effects associated with
modification of a non-essential gene with no phenotype - for example, deletion,
disruption or mutation of both copies of a non-essential gene with no phenotype in a
diploid organism has essentially the same effect as deletion, disruption, or mutation of
one of the two copies present in the diploid organism. In the context of the present
invention, a non-essential gene is typically one whose function has been eliminated (e.g.,
by a deletion mutation) and such elimination of function was non-lethal and the organism
developed, matured, and was able to reproduce.

The "native sequence" or "wild-type sequence" of a gene is the polynucleotide
sequence that comprises the genetic locus corresponding to the gene, e.g., all regulatory 
and open-reading frame coding sequences required for expression of a completely 
functional gene product as they are present in the wild-type genome of an organism. The 
native sequence of a gene can include, for example, transcriptional promoter sequences,
translation enhancing sequences, introns, exons, and poly-A processing signal sites. It is 
noted that in the general population, wild-type genes may include multiple prevalent 
versions that contain alterations in sequence relative to each other and yet do not cause a 
discernible pathological effect. These variations are designated "polymorphisms" or 
"allelic variations."

By "replacement sequence" is meant a polynucleotide sequence that is substituted 
for at least a portion of the native or wild-type sequence of a gene. 

"Linear vector" or "linearized vector," as used herein, is a vector having two 
ends. For example, circular vectors, such as plasmids, can be linearized by digestion 
with a restriction endonuclease that cuts at a single site in the plasmid. Preferably, the
targeting vectors described herein are linearized such that the ends are not within the 
targeting sequences. 
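
Choosing an enzyme for linearization, as described above, amounts to finding a restriction endonuclease that cuts the circular plasmid exactly once and whose site lies outside the targeting sequences. The Python sketch below illustrates one way to screen for this. It is an assumption-laden example: the plasmid is a plain DNA string treated as circular, targeting sequences are given as hypothetical half-open (start, end) index ranges, and SITES is a small illustrative table of well-known palindromic recognition sequences rather than a complete enzyme list. The function names are invented for this example.

```python
# Illustrative table only; recognition sequences are palindromic, so a
# single-strand search is sufficient for these enzymes.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "NotI": "GCGGCCGC",
}


def site_positions(plasmid: str, site: str) -> list:
    """Start positions of a recognition site in a circular plasmid sequence."""
    n = len(plasmid)
    # Append the first len(site) - 1 bases so sites spanning the origin are found.
    circular = plasmid + plasmid[: len(site) - 1]
    return [i for i in range(n) if circular.startswith(site, i)]


def linearizing_enzymes(plasmid: str, targeting_regions, sites=SITES) -> list:
    """Enzymes that cut exactly once, with the site outside all targeting regions."""
    plasmid = plasmid.upper()
    n = len(plasmid)
    protected = set()
    for start, end in targeting_regions:
        protected.update(range(start, end))
    usable = []
    for name, site in sites.items():
        hits = site_positions(plasmid, site)
        if len(hits) != 1:
            continue  # no site, or more than one cut
        occupied = {(hits[0] + k) % n for k in range(len(site))}
        if occupied.isdisjoint(protected):
            usable.append(name)
    return usable
```

An enzyme whose single site falls inside a targeting region is rejected, which mirrors the stated preference that the ends of the linearized vector not lie within the targeting sequences.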

General Overview 

The present invention relates to vector constructs and methods of creating
transgenic animals to be used, for example, as test systems. Methods of using the
animals of the present invention include, but are not limited to, using these animals for 
studies involving tumor growth and other disease conditions. In the practice of the 
present invention, transgenic, non-human mammals are constructed where a single-copy,
non-essential gene is replaced by a reporter expression cassette, preferably a gene
encoding a light-generating protein, such as a luciferase-encoding gene, operably linked
to a promoter. A variety of promoters are useful in the practice of the present invention, 
for example, promoters derived from genes associated with tumorigenesis or 
angiogenesis. Thus, an exemplary promoter can be one that is associated with proteins
induced during tumorigenesis, for instance in the presence of tumor generating
compounds or of tumors themselves. In this way, expression of the reporter cassette is
induced in the animal when, for example, tumors are present, and progression of the
tumor can be evaluated by noninvasive imaging methods using the whole animal.
Another exemplary promoter is one that is derived from a gene associated with 
angiogenesis. Because the promoter is linked to a reporter such as luciferase, non-
invasive monitoring of the progression of angiogenesis is possible.

Various forms of the different embodiments of the invention, described herein,
may be combined.

Non-invasive imaging and/or detecting of light-emitting conjugates in 
mammalian subjects was described in U.S. Patent No. 5,650,135, by Contag, et al.,
issued 22 July 1997. This imaging technology can be used in the practice of the present
invention in view of the teachings of the present specification. In the imaging method,
the conjugates contain a biocompatible entity and a light-generating moiety. 
Biocompatible entities include, but are not limited to, small molecules such as cyclic 
organic molecules; macromolecules such as proteins; microorganisms such as viruses, 
bacteria, yeast and fungi; eukaryotic cells; all types of pathogens and pathogenic
substances; and particles such as beads and liposomes. In another aspect, biocompatible
entities may be all or some of the cells that constitute the mammalian subject being 
imaged, for example, cells carrying the vector constructs of the present invention 
expressing a reporter expression cassette. 

Light-emitting capability is conferred on the biocompatible entities by the
conjugation of a light-generating moiety. Such moieties include fluorescent molecules,
fluorescent proteins, enzymatic reactions giving off photons and luminescent substances, 
such as bioluminescent proteins. In the context of the present invention, light emitting
capability is typically conferred on target cells by having at least one copy of a light-
generating protein, e.g., a luciferase, present. In preferred embodiments, luciferase is
operably linked to appropriate control elements which can facilitate expression of a
polypeptide having luciferase activity. Substrates of luciferase can be endogenous to the 
cell or applied to the cell or system (e.g., injection into a transgenic mouse, having cells 
carrying a luciferase construct, of a suitable substrate for the luciferase, for example, 
luciferin). The conjugation may involve a chemical coupling step, genetic engineering of
a fusion protein, or the transformation of a cell, microorganism or animal to express a
light-generating protein.

Targeting Constructs

The targeting cassettes described herein typically include the following 
components: (1) a suitable vector backbone; (2) a polynucleotide encoding a light
generating protein; (3) a promoter operably linked to the light generating protein-
encoding gene, wherein the promoter is heterologous to the light generating protein
coding sequences; (4) a sequence encoding a positive selection marker; (5) insertion sites
flanking the sequence encoding the positive selection marker and the polynucleotide
encoding a light generating protein gene, for insertion of sequences which target a
single-copy, non-essential chromosomal gene; and, optionally, (6) a sequence encoding a
negative selection marker. Exemplary targeting constructs are shown in Figures 3B, 5A
and 5B and described in Examples 1-3.

Suitable vector backbones generally include an F1 origin of replication; a colE1
plasmid-derived origin of replication; polyadenylation sequence(s); sequences encoding
antibiotic resistance (e.g., ampicillin resistance) and other regulatory or control elements.
Non-limiting examples of appropriate backbones include: pBluescriptSK (Stratagene, La
Jolla, CA); pBluescriptKS (Stratagene, La Jolla, CA) and other commercially available 
vectors. 

In one aspect of the invention the light generating protein is luciferase. 
Luciferase coding sequences useful in the practice of the present invention include, but
are not limited to, sequences obtained from lux genes (procaryotic genes encoding a
luciferase activity) and luc genes (eucaryotic genes encoding a luciferase activity). A
variety of luciferase encoding genes have been identified including, but not limited to,
the following: B.A. Sherf and K.V. Wood, U.S. Patent No. 5,670,356, issued 23
September 1997; Kazami, J., et al., U.S. Patent No. 5,604,123, issued 18 February 1997;
S. Zenno, et al., U.S. Patent No. 5,618,722; K.V. Wood, U.S. Patent No. 5,650,289,
issued 22 July 1997; K.V. Wood, U.S. Patent No. 5,641,641, issued 24 June 1997; N.
Kajiyama and E. Nakano, U.S. Patent No. 5,229,285, issued 20 July 1993; M.J. Cormier
and W.W. Lorenz, U.S. Patent No. 5,292,658, issued 8 March 1994; M.J. Cormier and
W.W. Lorenz, U.S. Patent No. 5,418,155, issued 23 May 1995; de Wet, J.R., et al.,
Molec. Cell. Biol. 7:725-737, 1987; Tatsumi, H.N., et al., Biochim. Biophys. Acta
1131:161-165, 1992; and Wood, K.V., et al., Science 244:700-702, 1989. Eukaryotic
luciferase catalyzes a reaction using luciferin as a luminescent substrate to produce light,
whereas prokaryotic luciferase catalyzes a reaction using an aldehyde as a luminescent

Wild-type firefly luciferases typically have an emission maximum at about 550 nm.
Numerous variants with differing emission maxima have also been studied. For
example, Kajiyama and Nakano (Protein Eng. 4(6):691-693, 1991; U.S. Patent No.
5,330,906, issued 19 July 1994) teach five variant firefly luciferases generated by single
amino acid changes to the Luciola cruciata luciferase coding sequence. The variants
have emission peaks of 558 nm, 595 nm, 607 nm, 609 nm and 612 nm. A yellow-green
luciferase with an emission peak of about 540 nm is commercially available from
Promega, Madison, WI under the name pGL3. A red luciferase with an emission peak of
about 610 nm is described, for example, in Contag et al. (1998) Nat. Med. 4:245-247 and 
Kajiyama et al. (1991) Prot. Eng. 4:691-693. 

Positive selection markers include any gene which encodes a product that can be readily
assayed. Examples include, but are not limited to, a hprt gene (Littlefield, J. W.,
Science 145:709-710 (1964)), a xanthine-guanine phosphoribosyltransferase (gpt) gene,
or an adenine phosphoribosyltransferase (aprt) gene (Sambrook et al., supra), a
thymidine kinase gene (i.e., "TK") and especially the TK gene of herpes simplex virus
(Giphart-Gassler, M. et al., Mutat. Res. 214:223-232 (1989)), a nptII gene (Thomas, K.
R. et al., Cell 51:503-512 (1987); Mansour, S. L. et al., Nature 336:348-352 (1988)), or
other genes which confer resistance to amino acid or nucleoside analogues, or
antibiotics, etc., for example, gene sequences which encode enzymes such as
dihydrofolate reductase (DHFR) enzyme, adenosine deaminase (ADA), asparagine
synthetase (AS), hygromycin B phosphotransferase, or a CAD enzyme (carbamyl
phosphate synthetase, aspartate transcarbamylase, and dihydroorotase). Addition of the
appropriate substrate of the positive selection marker can be used to determine if the
product of the positive selection marker is expressed, for example cells which do not
express the positive selection marker nptII, are killed when exposed to the substrate
G418 (Gibco BRL Life Technology, Gaithersburg, MD). 

The targeting vector typically contains insertion sites for inserting targeting
sequences (e.g., sequences that are substantially homologous to the target sequences in
the host genome where integration of the targeting vector/expression cassette is desired). 
These insertion sites are preferably included such that there are two sites, one site on 
either side of the sequences encoding the positive selection marker, light generating
protein (e.g., luciferase) and the promoter. Insertion sites are, for example, restriction 
endonuclease recognition sites, and can, for example, represent unique restriction sites. 
In this way, the vector can be digested with the appropriate enzymes and the targeting 
sequences ligated into the vector.

Optionally, the targeting construct can contain a polynucleotide encoding a 
negative selection marker. Suitable negative selection markers include, but are not 
limited to, HSV-tk (see, e.g., Majzoub et al. (1996) New Engl J. Med. 334:904-907 and
U.S. Patent No. 5,464,764), as well as genes encoding various toxins including the
diphtheria toxin, the tetanus toxin, the cholera toxin and the pertussis toxin. A further
negative selection marker gene is the hypoxanthine-guanine phosphoribosyl transferase
(HPRT) gene for negative selection in 6-thioguanine.

Exemplary promoters and single-copy, non-essential genes for use in the vector 
constructs and methods of the present invention are described below. 

Promoters 

The targeting constructs and transgenic animals described herein contain a
sequence encoding a light generating protein (e.g., a luciferase gene) operably linked to a
promoter. The promoter may be from the same species as the transgenic animal (e.g.,
mouse promoter used in construct to make transgenic mouse) or from a different species
(e.g., human promoter used in construct to make transgenic mouse). The promoter can 
be derived from any gene of interest. In one embodiment of the present invention, the 
promoter is derived from a gene whose expression is induced during angiogenesis, for 
example pathogenic angiogenesis like tumor development. Thus, when a tumor begins
to develop in a transgenic animal carrying a vector construct of the present invention, the
promoter is induced and the animal expresses luciferase, which can then be monitored in 
vivo. 

Exemplary promoters for use in the present invention are selected such that they 
are functional in a cell type and/or animal into which they are being introduced. 
Exemplary promoters include, but are not limited to, promoters obtained from the
following mouse genes: vascular endothelial growth factor (VEGF) (VEGF promoter
described in U.S. Patent No. 5,916,763; Shima et al. (1996) J. Bio. Chem. 271:3877-3883;
sequence available on NCBI under accession number U41383); VEGFR2, also
known as Flk-1 (VEGFR-2 promoter described, for example, in Ronicke et al. (1996)
Circ. Res. 79:277-285; Patterson et al. (1995) J. Bio. Chem. 270:23111-23118; Kappel et
al. (1999) Blood 93:4282-4292; sequence available as accession number X89777 of
NCBI database); Tie2, also known as Tek (Tie2 promoter described, for example, in
Fadel et al. (1998) Biochem. J. 338:335-343; Schlaeger et al. (1995) Develop. 121:1089-1098;
Schlaeger et al. (1997) PNAS USA 94:3058-3063). VEGF is a specific mitogen for
EC in vitro and a potent angiogenic factor in vivo. In a tumorigenesis study, it was shown
that VEGF was critical for the initial subcutaneous growth of T-47D breast carcinoma
cells transplanted into nude mice, whereas other angiogenic factors, such as bFGF, can
compensate for the loss of VEGF after the tumors have reached a certain size (Yoshiji,
H., et al., 1997 Cancer Research 57: 3924-28). VEGF is a major mediator of aberrant EC
proliferation and vascular permeability in a variety of human pathologic situations, such
as tumor angiogenesis, diabetic retinopathy and rheumatoid arthritis (Benjamin LE, et
al., 1997 PNAS 94: 8761-66; Soker, S., et al., 1998 Cell 92: 735-745). VEGF is
synthesized by tumor cells in vivo and accumulates in nearby blood vessels. Because
leaky tumor vessels initiate a cascade of events, which include plasma extravasation and 
which lead ultimately to angiogenesis and tumor stroma formation, VEGF plays a pivotal 
role in promoting tumor growth (Dvorak, H.F., et al., 1991 J Exp Med 174:1275-8).
VEGF expression was upregulated by hypoxia (Shweiki, D., et al., 1992 Nature 359:
843-5). VEGF is also upregulated by overexpression of v-Src oncogene (Mukhopadhyay,
D., et al., 1995 Cancer Res. 15: 6161-5), c-SRC (Mukhopadhyay, D., et al., 1995 Nature
375: 577-81), and mutant ras oncogene (Plate, K.H., et al., 1992 Nature 359: 845-8).
The tumor suppressor p53 downregulates VEGF expression (Mukhopadhyay, D., et
al., 1995 Cancer Res. 15: 6161-5).

A number of cytokines and growth factors, including PGF and TPA (Grugel, S.,
et al., 1995 J. Biological Chem. 270: 25915-9), EGF, TGF-b, IL-1, IL-6 induce VEGF
mRNA expression in certain types of cells (Ferrara, N., et al., 1997 Endocr. Rev. 18: 4-25).
A G-protein-coupled receptor encoded by Kaposi's sarcoma-associated herpesvirus (KSHV),
a homolog of the IL-8 receptor, can activate JNK/SAPK and p38MAPK and
increase VEGF production, thus causing cell transformation and tumorigenicity (Bais,
C., et al., Nature 1998 391:86-9). VEGF overexpression in skin of transgenic mice
induces angiogenesis, vascular hyperpermeability and accelerated tumor development
(Larcher, F., et al., Oncogene 1998 17:303-11).

VEGF-B (cDNA sequences available on databases) is a mitogen for EC and may
be involved in angiogenesis in muscle and heart (Olofsson, B., et al., 1996 Proc Natl
Acad Sci U S A 93:2576-81). In vitro, binding of VEGF-B to its receptor
VEGFR-1 leads to increased expression and activity of urokinase type plasminogen
activator and plasminogen activator inhibitor, suggesting a role for VEGF-B in the
regulation of extracellular matrix degradation, cell adhesion, and migration (Olofsson,
B., et al., 1998 Proc Natl Acad Sci U S A 95:11709-14).

VEGF-C (see, e.g., U.S. Patent No. 5,916,763 and Shima et al., supra) may
regulate angiogenesis of lymphatic vasculature, as suggested by the pattern of VEGF-C
expression in mouse embryos (Kukk, E., et al., 1996 Development 122: 3829-37).
Although VEGF-C is also a ligand for VEGFR-2, the functional significance of this
potential interaction is unknown. Overexpression of VEGF-C in the skin of transgenic
mice resulted in lymphatic, but not vascular, endothelial proliferation and vessel
enlargement, suggesting the major function of VEGF-C is through VEGFR-3 rather than
VEGFR-2 (Jeltsch M, et al., 1997 Science 276:1423-5). As shown by the CAM assay,
VEGF and VEGF-C are specific angiogenic and lymphangiogenic growth factors,
respectively (Oh, S.J., et al., (1997) Devel. Biol. 188: 96-109). VEGF-C overexpression
in the skin of transgenic mice resulted in lymphatic, but not vascular, endothelial
proliferation and vessel enlargement (Jeltsch M, et al., 1997 Science 276: 1423-5).

VEGF-D (cDNA sequences available on databases) is a mitogen for EC. Given
that VEGF-D can also activate VEGFR-3, it is possible that VEGF-D could be involved
in the regulation of growth and/or differentiation of lymphatic endothelium (Achen,
M.G., et al., 1998 Proc Natl Acad Sci U S A 95: 548-53). VEGF-D is induced by
transcription factor c-Fos in mouse (Orlandini, M., 1996 PNAS 93: 11675-80).

The VEGFR-1 signaling pathway may regulate normal endothelial cell-cell or
cell-matrix interactions during vascular development, as suggested by the knockout study
(Fong, G.H., et al., 1995 Nature 376: 65-69). Although VEGFR-1 has a higher affinity
for VEGF than VEGFR-2, it does not transduce the mitogenic signals of VEGF in ECs
(Soker, S., et al., 1998 Cell 92: 735-745). VEGFR-2 (see, e.g., Ronicke et al., Patterson
et al., Kappel et al. (1999), supra) appears to be the major transducer of VEGF signals in
EC that result in chemotaxis, mitogenicity and gross morphological changes in target
cells (Soker, S., et al., 1998 Cell 92: 735-745). VEGFR-3 has an essential role in the
development of the embryonic cardiovascular system before the emergence of lymphatic
vessels, as shown by the knockout study (Dumont, D.J., et al., 1998 Science 282: 946-949).
Neuropilin-1 (see, e.g., Soker et al. (1998) Cell 92:735-745) is a receptor for
VEGF165. It can enhance the binding of VEGF165 to VEGFR-2 and VEGF165-mediated
chemotaxis (Soker, S., et al., 1998 Cell 92: 735-745). Neuropilin-1
overexpression in transgenic mice resulted in embryonic lethality. The embryos
possessed excess capillaries and blood vessels. Dilated vessels and hemorrhage were also
observed (Kitsukawa, T., et al., 1995 Development 121: 4309-18).

Further promoters of interest include, but are not limited to, the following. Ang2
is expressed only at sites of predominant vascular remodeling, such as ovary, placenta
and uterus (Maisonpierre, P.C., et al., 1997 Science 277: 55-60). In glioblastoma
angiogenesis, Ang2 is found to be expressed in endothelial cells of small blood vessels
and capillaries, while Ang1 is expressed in glioblastoma tumor cells (Stratmann, A., 1998
Am J Pathol 153: 1459-66). Ang2 is up-regulated in bovine microvascular endothelial
cells by VEGF, bFGF, cytokines and hypoxia (Mandriota, S.J., 1998 Circ Res 83: 852-9).
Ang2 transgenic overexpression disrupts angiogenesis and is embryonic lethal (Maisonpierre,
P.C., et al., 1997 Science 277: 55-60). Ang1 is widely expressed, but is less abundant in heart
and liver (Maisonpierre, P.C., et al., 1997 Science 277: 55-60). Ang1 is expressed in
mesenchymal cells and may up-regulate the expression of Tie2 in the endothelial cells
(Suri, C., et al., 1996 Cell 87: 1171-1180). Ang1 overexpression in the skin of transgenic
mice produces larger, more numerous, and more highly branched vessels (Suri, C., et al.,
Science 1998 282:468-71). Tie2 (see, e.g., Fadel et al.; Schlaeger et al. (1995), and
Schlaeger et al. (1997), supra) is endothelial cell specific and is up-regulated during wound
healing, follicle maturation (Puri, M.C., et al., 1995 EMBO J 14: 5884-91) and
pathologic angiogenesis (Kaipainen, A., 1994 Cancer Research 54: 6571-77), such as
glioblastoma (Stratmann, A., 1998 Am J Pathol 153: 1459-66). Tie2 is also expressed in
non-proliferating adult endothelium and endothelial cell lines (Dumont, D.J., et al. (1994)
Genes & Develop. 8: 1897-1909). A Tie2 activating mutation causes vascular
dysmorphogenesis (Vikkula M, et al., 1996 Cell 87: 1181-1190). Tie2 mutant
overexpression in transgenic mice is embryonic lethal (Dumont, D.J., et al., supra).

Other promoters useful in the practice of the present invention include, by way of
example, promoters derived from the sequences encoding the following polypeptide
products: PTEN (dual specificity phosphatase); BAI (brain-specific angiogenesis
inhibitor); KAI1 (KANGAI 1); catenin beta-1 (cadherin-associated protein, beta); COX2
(PTGS2 cyclooxygenase 2, a.k.a. prostaglandin-endoperoxide synthase 2); MMP2 (72
kDa Type IV-A collagenase); MMP9 (92 kDa type IV-B collagenase); TIMP2 (tissue
inhibitor of metalloproteinase 2); and TIMP3 (tissue inhibitor of metalloproteinase 3).

PTEN is a tumor suppressor gene and encodes a protein of 403 amino acids. (Li
et al. (1997) Science 275:1943-1946; DiCristofano et al. (1998) Nature Genet. 19:348-355).
Overexpression of PTEN has been shown to inhibit cell migration, and it is
postulated that this protein may function as a tumor suppressor by negatively regulating
cell interactions with the extracellular matrix or by negatively regulating the
PI3K/PKB/Akt signaling pathway. (Tamura et al. (1998) Science 280:1614-1617;
Stambolic et al. (1998) Cell 95:29-39). Mutations in PTEN have been detected in cancer
cell lines and in the germline of patients having Cowden disease, Lhermitte-Duclos
disease and Bannayan-Zonana syndrome (diseases and syndromes which are
characterized by hyperplastic/dysplastic changes in the prostate, skin and colon and
which are associated with an increased risk of certain cancers, for example, breast
cancer, prostate cancer and colon cancer). (Marsh et al. (1998) Hum. Molec. Genet.
7:507-515; Marsh et al. (1998) J. Med. Genet. 35:881-885; Nelen et al. (1997) Hum.
Molec. Genet. 6:1383-1387).

BAI1 protein is predicted to be 1,584 amino acids in length and includes an
extracellular domain, an intracellular domain and a 7-span transmembrane region similar
to that of the secretin receptor. (Nishimori et al. (1997) Oncogene 15:2145-2150). The
extracellular region of BAI1 has a single Arg-Gly-Asp (RGD) motif recognized by
integrins and also has five sequences corresponding to the thrombospondin type I
(accession number 188060) repeats that can inhibit angiogenesis induced by basic
fibroblast growth factor (bFGF, accession number 134920). Shiratsuchi et al. (1997)
Cytogenet. Cell Genet. 79:103-108, cloned two other brain-specific angiogenesis inhibiting
genes, designated BAI2 (accession number 602683) and BAI3 (accession number
602684). Thus, it is postulated that members of this gene family may play a role in
suppression of glioblastoma.




KAI1 encodes a 267 amino acid protein which is a member of the leukocyte
surface glycoprotein family. The protein has 4 hydrophobic transmembrane domains
and 1 large extracellular hydrophilic domain with three potential N-glycosylation sites.
(Dong et al. (1995) Science 268:884-886). Molecular analysis of KAI1 is described, for
example, in Dong et al. (1997) Genomics 41:25-32. KAI1 is a tumor metastasis
suppressor gene that is capable of inhibiting the metastatic process in experimental
animals. Expression of KAI1 is downregulated during tumor progression of prostate,
breast, lung, bladder and pancreatic cancers in humans, apparently at the transcriptional
or posttranscriptional level. Mashimo et al. (1998) PNAS USA 95:11307-11311, found
that the tumor suppressor gene p53 can directly activate the KAI1 gene by interacting
with the region 5' to the coding sequence, suggesting a direct relationship between p53
and KAI1.

Catenin beta-1 is an adherens junction (AJ) protein. AJs are critical for
establishing and maintaining epithelial cell layers, for instance during embryogenesis,
wound healing and tumor cell metastasis. Molecular analysis, including description of
sequence homology to plakoglobin (accession number 173325), homology to the
Drosophila gene "armadillo" and interactions with Lef1/Tcf DNA binding proteins, is
described, for example, in Nollet et al. (1996) Genomics 32:413-424; McCrea et al.
(1991) Science 254:1359-1361 and Korinek et al. (1997) Science 275:1784-1787. In
addition, studies by Korinek et al., supra, and Morin et al. (1997) Science 275:1787-1790,
have indicated that APC (accession number 175100) negatively regulates catenin beta
and that regulation of this protein is critical to the tumor suppressive effect of APC.
Abnormally high levels of beta-catenin have been detected in certain human melanoma
cell lines. (Rubinfeld et al. (1997) Science 275:1790-1792). Koch et al. (1999) Cancer
Res. 59:269-273 report that childhood hepatoblastomas frequently carry a mutated
degradation targeting box of the beta-catenin gene. Transgenic mice which express
catenin beta under the control of an epidermal promoter undergo de novo hair
morphogenesis and eventually these animals develop two types of tumors: epithelioid
cysts and trichofolliculomas. (Gat et al. (1998) Cell 95:605-614).

COX2 encodes a cyclooxygenase and is a key regulator of prostaglandin
synthesis. (Hla et al. (1992) PNAS USA 89:7384-7388; Jones et al. (1993) J. Biol. Chem.
268:9049-9054). In particular, COX2 is generally considered to be a mediator of
inflammation, and overexpression of COX2 in rat epithelial cells results in elevated levels
of E-cadherin and Bcl2. (Tsujii & DuBois (1995) Cell 83:493-501). In co-cultures of
endothelial cells and colon carcinoma cells, cells that overexpress COX2 produce
prostaglandins and proangiogenic factors and stimulate both endothelial migration and tube
formation. (Tsujii et al. (1998) Cell 93:705-716). Experiments conducted using APC
knock-out mice have demonstrated that animals homozygous for a disrupted COX2 locus
develop significantly fewer adenomatous polyps. (Oshima et al. (1996) Cell 87:803-809).
COX-2 "knock out" mice develop severe nephropathy, are susceptible to peritonitis,
exhibit reduced arachidonic acid-induced inflammation and exhibit reduced
indomethacin-induced gastric ulceration. (Morham et al. (1995) Cell 83:473-482;
Langenbach et al. (1995) Cell 83:483-492). Female mice that are deficient in
cyclooxygenase 2 exhibit multiple reproductive failures. (Lim et al. (1997) Cell 91:197-208).

MMP2 is a metalloproteinase that specifically cleaves type IV collagen. A
C-terminal fragment of MMP2, termed PEX, prevents normal binding to alpha-V/beta-3 and
disrupts angiogenesis and tumor growth. (Brooks et al. (1998) Cell 92:391-400).

MMP9 is a collagenase secreted from normal skin fibroblasts. MMP9 null mice
exhibit an abnormal pattern of skeletal growth plate vascularization and ossification.
(Vu et al. (1998) Cell 93:411-422).

TIMP2 is a collagenase inhibitor and appears to play a major role in modulating the
activity of interstitial collagenase and a number of connective tissue
metalloendoproteases. (Stetler-Stevenson et al. (1989) J. Biol. Chem. 264:17372-17378).
Unlike TIMP1 and TIMP3, TIMP2 is not upregulated by TPA or TGF-beta. (Hammani
et al. (1996) J. Biol. Chem. 271:25498-25505).

TIMP3 (Wilde et al. (1994) DNA Cell Biol. 13:711-718) is localized in the
extracellular matrix in both its glycosylated and unglycosylated forms. Studies of mutant
TIMP3 proteins have demonstrated that C-terminal truncations do not bind to the
extracellular matrix. (Langton et al. (1998) J. Biol. Chem. 273:16778-16781).

As one of skill in the art will appreciate in view of the teachings of the present
specification, promoter sequences can be easily derived and isolated from known
polypeptide sequences or from cDNA or genomic sequences, using methods known in the
art in view of the teachings herein. An exemplary method of isolating promoter
sequences using cDNA is via a GenomeWalker® kit, commercially available from
Clontech (Palo Alto, CA), and described on page 27 of the 1997-1998 Clontech catalog.

Targeting Sequences: Non-Essential Genes 
Central to the present invention is the fact that the targeting constructs contain
"targeting" sequences (flanking, for example, the light-generating protein-encoding
sequence and promoter) derived from a single-copy, non-essential gene. These targeting
sequences in the construct act via homologous recombination to replace at least a portion
of the non-essential gene in the genome with the light-generating protein-encoding (e.g.,
luciferase-encoding) sequence operably linked to a promoter.
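
The construct layout described above (targeting sequences flanking a promoter and a light-generating protein coding sequence) can be illustrated with a minimal Python sketch. This is a toy string model only, not a method taught in the specification: the names `Segment`, `build_targeting_construct` and `simulate_replacement`, and the short sequences, are hypothetical, and real homologous recombination involves kilobase-scale arms and selection steps that are not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One labeled piece of a hypothetical targeting construct (5'->3')."""
    name: str
    sequence: str


def build_targeting_construct(left_arm, promoter, reporter, right_arm):
    """Order the pieces as: 5' targeting arm, promoter, reporter, 3' targeting arm."""
    segments = [
        Segment("5' targeting sequence", left_arm.upper()),
        Segment("promoter", promoter.upper()),
        Segment("light-generating protein coding sequence", reporter.upper()),
        Segment("3' targeting sequence", right_arm.upper()),
    ]
    for seg in segments:
        if not seg.sequence or set(seg.sequence) - set("ACGT"):
            raise ValueError(f"{seg.name}: expected a non-empty A/C/G/T string")
    return segments


def simulate_replacement(genome, segments):
    """Toy model of replacement by homologous recombination.

    Whatever lies between the two targeting arms in the genome string is
    swapped for the promoter + reporter cassette. Each arm must occur exactly
    once, mirroring the single-copy requirement.
    """
    genome = genome.upper()
    left, right = segments[0].sequence, segments[-1].sequence
    for arm in (left, right):
        if genome.count(arm) != 1:
            raise ValueError("each targeting arm must match the genome exactly once")
    start = genome.index(left) + len(left)
    end = genome.index(right)
    if end < start:
        raise ValueError("3' arm must lie downstream of the 5' arm")
    cassette = "".join(seg.sequence for seg in segments[1:-1])
    return genome[:start] + cassette + genome[end:]


# Toy example: the stretch "GGGGGG" between the arms is replaced by the cassette.
construct = build_targeting_construct("AACC", "TATA", "ATGGAA", "CCAA")
edited = simulate_replacement("TTTTAACCGGGGGGCCAATTTT", construct)
print(edited)
```

The uniqueness check on each arm is the part that corresponds to choosing a single-copy gene: if an arm matched more than one place in the genome string, the site of replacement would be ambiguous.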

Non-limiting examples of targeting sequences for use in generating transgenic
mice include sequences obtained from or derived from vitronectin, Fos B and galectin 3.
A search of the Mouse Knockout & Mutation Database (Genome Systems, Inc., St. Louis,
MO) can be used to identify genes that have been knocked out in mice where the
generated knockout mice displayed no obvious defects. The chromosomal locus for all
these genes can be used to target promoter-(light generating protein, e.g., luciferase)
transgenes similar to what is described in Example 2. Single-copy, non-essential mouse
genes identified in this manner include, but are not limited to, the following: Moesin
(Msn), Doi Y., et al., J Biol Chem 1999, 274:2315-2321; Plasminogen activator
inhibitor, type II (Planh2) and Planh1, Dougherty K.M., Proc Natl Acad Sci USA 1999,
96:686-691; Protein tyrosine phosphatase, receptor type, B (Ptprb), Elchebly et al.
(1999) Science 283:1544-1548; Presenilin 1 (Psen1), Guo Q, et al. (1999) Proc Natl
Acad Sci USA, 96:4125-4130; Protein kinase, mitogen-activated 9 (Prkm9)/
SAPK/Erk/kinase 2 (Serk2), Kuan CY et al. (1999) Neuron 4:667-676; CD152 antigen
(Cd152) / CD86 antigen (Cd86) / CD80 antigen (Cd80), Mandelbrot DA, et al. (1999) J
Exp Med, 189:435-440; Poly (ADP-ribose) polymerase (Adprp), Masutani M, et al.
(1999), Proc Natl Acad Sci USA 96:2301-2304; Sodium channel, non voltage-gated 1
beta (Scnn1b), Pradervand S, et al. (1999) Proc Natl Acad Sci USA 96:1732-1737;
Nuclear receptor coactivator 1 (Ncoa1), Qi C, et al. (1999) Proc Natl Acad Sci USA
96:1585-1590; Decay accelerating factor 1 (Daf1), Sun X, et al. (1999) Proc Natl Acad
Sci USA 1999, 96:628-633; Necdin (Ndn), Tsai TF, et al. (1999) Nat Genet 22:15-16;
Relaxin (Rln), Zhao L, et al. (1999) Endocrinology 140:445-453; Adenylyl cyclase 8
(Adcy8), Abdel-Majid RM, et al. (1998) Nat Genet 19:289-291; Leukemia inhibitory
factor (Lif), Bugga L, et al. (1998) J Neurobiol 36:509-524; Lectin, galactose binding,
soluble 3 (Lgals3) and Lgals1, Colnot C, et al. (1998) Dev Dyn 211:306-313; Urokinase
plasminogen activator receptor (Plaur), Carmeliet P, et al. (1998) J Cell Biol 140:233-
245; Nitric oxide synthase 1, neuronal (Nos1), Chao DS, et al. (1998) J Neurochem
71:784-789; Homeo box A7 (Hoxa7), Chen F, et al. (1998) Mech Dev 77:49-57;
Myosin light chain, phosphorylatable, cardiac ventricles (Mylpc), Chen J, et al. (1998) J
Biol Chem 273:1252-1256; Homeo box B7 (Hoxb7), Chen F, et al. (1998) Mech Dev
77:49-57; Nuclear factor of kappa light chain gene enhancer in B-cells inhibitor, alpha
and beta (Nfkbia and Nfkbib), Cheng JD, et al. (1998) J Exp Med 6:1055-1062; Enolase
1, alpha non-neuron (Eno1), Couldrey C, et al. (1998) Dev Dyn 212:284-292;
Xeroderma pigmentosum, complementation group A (Xpa), De Vries A, et al. (1998)
Exp Eye Res 1998, 67:53-59; Von Willebrand factor homolog (Vwf), Denis C, et al.
(1998) Proc Natl Acad Sci USA 95:9524-9529; Lysosomal acid lipase 1 (Lip1), Du H, et
al. (1998) Hum Mol Genet 7:1347-1354; UNC-5 homolog (C. elegans) 3 (Unc5h3),
Eisenman LM, et al. (1998) J Comp Neurol 394:106-117; Protein phosphatase 1,
regulatory (inhibitor) subunit 1B (Ppp1r1b), Fienberg AA, et al. (1998) Science
281:838-842; Myelin-associated glycoprotein (Mag), Fujita N, et al. (1998) J Neurosci
18:1970-1978; Paraoxonase 1 (Pon1), Furlong CE, et al. (1998) Neurotoxicology
19:645-650; Brain derived neurotrophic factor (Bdnf), Gacek RR, et al. (1998)
Laryngoscope 108:671-678; Neurotrophin 3 (Ntf3), Gacek RR, et al. (1998)
Laryngoscope 108:671-678; Myoglobin (Mb), Garry DJ, et al. (1998) Nature 395:905-
908; Opioid receptor, mu (Oprm), Gaveriaux-Ruff C, et al. (1998) Proc Natl Acad Sci
USA 95:6326-6330; Neuropeptide Y (Npy), Hollopeter G, et al. (1998) Int J Obes Relat
Metab Disord 22:506-512; Procollagen, type I, alpha 1 (Cola1), Hormuzdi SG, et al.
(1998) Mol Cell Biol 18:3368-3375; Centromere autoantigen B (Cenpb), Hudson DF, et
al. (1998) J Cell Biol 141:309-319; Oculocerebrorenal syndrome of Lowe (Ocrl), Janne
PA, et al. (1998) J Clin Invest 101:2042-2053; Arachidonate 12-lipoxygenase (Alox12),
Johnson EN, et al. (1998) Proc Natl Acad Sci USA 95:3100-3105; H19 fetal liver
mRNA (H19), Jones BK, et al. (1998) Genes Dev 12:2200-2207; Hepatocyte nuclear
factor 3 gamma (winged helix transcription factor) (Hnf3g), Kaestner KH, et al. (1998)
Mol Cell Biol 18:4245-4251; Bone morphogenetic protein 2 (Bmp2) / Bone
morphogenetic protein 7 (Bmp7) and Bmp5, Katagiri T, et al. (1998) Dev Genet 22:340-
348; Intercellular adhesion molecule (Icam1), Ley K, et al. (1998) Circ Res 83:287-294;
Glutamyl aminopeptidase (Enpep), Lin Q, et al. J Immunol 1998, 160:4681-4687; Prion
protein (Prnp), Lipp HP, et al. Behav Brain Res 1998, 95:47-54; RAB3A, member RAS
oncogene family (Rab3a), Lonart G, et al. Neuron 1998, 21:1141-1150; Potassium
voltage gated channel, shaker related subfamily, member 4 (Kcna4), London B, et al., J
Physiol (Lond) 1998, 509:171-182; Apurinic/apyrimidinic endonuclease (Apex),
Ludwig DL, et al. Mutat Res 1998, 409:17-29; T-cell receptor gamma, variable 5 (Tcrg-
V5), Mallick-Wood CA, et al. Science 1998, 279:1729-1733; Nuclear factor, erythroid
derived 2, like 2 (Nfe2l2), Martin F, et al. Blood 1998, 91:3459-3466; Interleukin 13
(Il13), McKenzie GJ, et al. Curr Biol 1998, 8:339-340; Sorbitol dehydrogenase 1 (Sdh1),
Ng T, et al. Diabetes 1998, 47:961-966; Guanine nucleotide binding protein, alpha 11
(Gna11), Offermanns S, et al. EMBO J 1998, 17:4304-4312; Estrogen receptor alpha
(Estra), Ogawa S, et al. Endocrinology 1998, 139:5070-5081; Integrin beta 2 (Itgb2),
Intercellular adhesion molecule (Icam1) and CD34 antigen, Oliveira-dos-Santos AJ, et
al. Eur J Immunol 1998, 28:2882-2892; Angiotensin receptor 1b (Agtr1b), Oliverio MI,
et al. Proc Natl Acad Sci USA 1998, 95:15496-15501; Complement factor B (factor B),
Pekna M, et al. Scand J Immunol 1998, 47:375-380; Centromere autoantigen B (Cenpb),
Perez-Castro AV, et al. Dev Biol 1998, 201:135-143; Procollagen, type V, alpha 2
(Col5a2) / Fibrillin 1 (Fbn1), Phelps RG, et al. Mol Med 1998, 4:356-360; Plasminogen
activator inhibitor, type 1 (Planh1), Pinsky DJ, et al. J Clin Invest 1998, 102:919-928,
Carmeliet P, et al. J Clin Invest 1993, 92:2756-2760; Placentae and embryos oncofetal
gene (Pem), Pitman JL, et al. Dev Biol 1998, 202:196-214; Postmeiotic segregation
increased 1 (S. cerevisiae) (Pms1), Prolla TA, et al. Nat Genet 1998, 18:276-278; Prion
protein, structural locus (Prn-p), Prusiner SB, et al. Proc Natl Acad Sci USA 1998,
90:10608-10612, Lledo P-M, et al. Proc Natl Acad Sci U S A 1996, 93:2403-2407,
Sailer A, et al. Cell 1994, 77:967-968, Weissmann C, et al. Philos Trans R Soc Lond
[Biol] 1994, 343:431-433, Bueler H, et al. Cell 1993, 73:1337-1347, Weissmann C, et
al. Intervirology 1993, 35:164-175; NAD(P)H:quinone oxidoreductase, Radjendirane V,
et al. J Biol Chem 1998, 273:7382-7389; Alpha tropomyosin (Tpm1), Rethinasamy P, et
al. Circ Res 1998, 82:116-123; Goosecoid and Goosecoid-like (Gscl), Wakamiya M, et
al. Hum Mol Genet 1998, 7:1835-1840, Saint-Jore B, et al. Hum Mol Genet 1998,
7:1841-1849; Schlafen 1 (Slfn1), Schwarz DA, et al. Immunity 1998, 9:657-668;
Nuclear factor, erythroid derived 2, ubiquitous (Mafk), Shavit JA, et al. Genes Dev 1998,
12:2164-2174; Microphthalmia-associated transcription factor (Mitf), Smith SB, et al.
Exp Eye Res 1998, 66:403-410; Bone morphogenetic protein 6 (Bmp6), Solloway MJ, et
al. Dev Genet 1998, 22:321-339; Phosphatidylinositol glycan, class A (Piga), Takahama
Y, et al. Eur J Immunol 1998, 28:2159-2166; Paired-related homeobox 2 (Prx2), Berge
D, et al. Development 1998, 125:3831-3842; Prostaglandin E receptor EP1 subtype
(Ptgerep1), Ushikubi F, et al. Nature 1998, 395:281-284; Immunoglobulin kappa chain
complex (Igk), van der Stoep N, et al. Immunity 1998, 8:743-750; Adenine
phosphoribosyl transferase (Aprt), Van Sloun PP, et al. Nucleic Acids Res 1998,
26:4888-4894; Microtubule associated protein 4 (Mtap4), Voss AK, et al. Dev Dyn
1998, 212:258-266; 3-hydroxy-3-methylglutaryl-coenzyme A lyase (Hmgcl), Wang
SP, et al. Hum Mol Genet 1998, 7:2057-2062; Fibroblast growth factor receptor 4
(Fgfr4), Weinstein M, et al. Development 1998, 125:3615-3623; Hepsin (Hpn), Wu Q,
et al. J Clin Invest 1998, 101:321-326; Small inducible cytokine A11 (Scya11), Yang T,
et al. Blood 1998, 92:3912-3923; Small nuclear ribonucleoprotein N (Snrpn), Yang T, et
al. Nat Genet 1998, 19:25-31; DNA fragmentation factor, alpha subunit (Dffa), Zhang J,
et al. Proc Natl Acad Sci USA 1998, 95:12480-12485; Early growth response 1 (Egr1),
Zheng D, et al. Neuroscience 1998, 83:251-258; Early growth response 1 (Egr1) /
Hormone receptor (Hmr), Zheng D, et al. Neuroscience 1998, 83:251-258;
Hemochromatosis (Hfe), Zhou XY, et al. Proc Natl Acad Sci USA 1998, 95:2492-2497;
Alpha tropomyosin (Tpm1), Blanchard EM, et al. Circ Res 1997, 81:1005-1010; tRNA
phosphoserine (Trsp), Bosl MR, et al. Proc Natl Acad Sci U S A 1997, 94:5531-5534;
Angiotensin receptor 1b (Agtr1b), Chen X, et al. Am J Physiol 1997, 272:F299-F304;
Xeroderma pigmentosum, complementation group C (Xpc), Cheo DL, et al. Mut Res
1997, 374:1-9; B cell leukemia/lymphoma 6 (Bcl6), Dent AL, et al. Science 1997,
276:589-592; Fumarylacetoacetate hydrolase (Fah) / 4-hydroxyphenylpyruvic acid
dioxygenase (Hpd), Endo F, et al. J Biol Chem 1997, 272:24426-24432;
N-methylpurine-DNA glycosylase (Mpg), Engelward BP, et al. Proc Natl Acad Sci USA
1997, 94:13087-13092; Interleukin 1 receptor, type 1 (Il1r1), Glaccum MB, et al. J
Immunol 1997, 159:3364-3371; N-methylpurine-DNA glycosylase (Mpg), Hang B, et
al. Proc Natl Acad Sci USA 1997, 94:12869-12874; Gamma-aminobutyric acid (GABA-
A) receptor, subunit alpha 6 (Gabra6), Homanics GE, et al. Mol Pharmacol 1997,
51:588-596; Superoxide dismutase 2, mitochondrial (Sod2), Huang TT, et al. Arch
Biochem Biophys 1997, 344:424-432; Interleukin 11 receptor, alpha chain 1 (Il11ra1),
Nandurkar HH, et al. Blood 1997, 90:2148-2159; Alkaline phosphatase 5 (Akp5),
Narisawa S, et al. Dev Dyn 1997, 208:432-446; GATA-binding protein 4 (Gata4), Narita
N, et al. Development 1997, 124:3755-3764; Lymphocyte protein tyrosine kinase (Lck) /
Fyn protooncogene (Fyn), Page ST, et al. Eur J Immunol 1997, 27:554-562; P
glycoprotein 3 (Pgy3), Schinkel AH, et al. Proc Natl Acad Sci U S A 1997, 94:4028-
4033; P glycoprotein 1 (Pgy1) / P glycoprotein 3 (Pgy3), Schinkel AH, et al., Proc Natl
Acad Sci U S A 1997, 94:4028-4033; Creatine kinase, mitochondrial 1, ubiquitous
(Ckmt1), Steeghs K, et al. J Neurosci Methods 1997, 71:29-41; T cell receptor gamma,
variable 4 (Tcrg-V4), Sunaga S, et al. J Immunol 1997, 158:4223-4228; Ia-associated
invariant chain (p31 form) (Ii), Takaesu NT, et al. J Immunol 1997, 158:187-199; Solute
carrier family 18 (vesicular monoamine), member 2 (Slc18a2), Takahashi N, et al. Proc
Natl Acad Sci USA 1997, 94:9938-9943; Matrix metalloproteinase 7 (Mmp7), Wilson
CL, et al. Proc Natl Acad Sci U S A 1997, 94:1402-1407; Formin (Fmn), Wynshaw-
Boris A, et al. Mol Med 1997, 3:372-384; Synaptophysin (Syp), Arrandale JM, et al. J
Biol Chem 1996, 271:21353-21358; Transformation related protein 53 (Trp53), Boehme
SA, et al. J Immunol 1996, 156:4075-4078; Neuronal nitric oxide synthase (Nos1),
Burnett AL, et al. Mol Medicine 1996, 2:288-296; Eph receptor A2 (Epha2), Chen J, et
al. Oncogene 1996, 12:979-988; Urokinase plasminogen activator receptor (Plaur),
Dewerchin M, et al. J Clin Invest 1996, 97:870-878; Growth differentiation factor 9
(Gdf9), Dong J, et al. Nature 1996, 383:531-535; Externally regulated phosphatase
(Ptpn16), Dorfman K, et al. Oncogene 1996, 13:925-931; Tenascin C (Tnc), Forsberg E,
et al. Proc Natl Acad Sci U S A 1996, 93:6594-6599; Integrin alpha 1 (Itga1), Gardner
H, et al. Dev Biol 1996, 175:301-313; FBJ osteosarcoma oncogene B (Fosb), Gruda
MC, et al. Oncogene 1996, 12:2177-2185; Breast cancer 1 (Brca1), Hakem R, et al. Cell
1996, 85:1009-1023; Megakaryocyte-associated tyrosine kinase (Matk), Hamaguchi I, et
al. Biochem Biophys Res Commun 1996, 224:172-179; Apolipoprotein B editing
complex 1 (Apobec1), Hirano K-I, et al. J Biol Chem 1996, 271:9887-9890; Carboxyl
ester lipase, Howles PN, et al. J. Biol. Chem. 1996, 271:7196-7202; Nuclear factor,
erythroid derived 2, ubiquitous (Nfe2u), Kotkow KJ, Orkin SH. Proc Natl Acad Sci U S
A 1996, 93:3514-3518; Retinoid X receptor gamma (Rxrg), Krezel W, et al. Proc Natl
Acad Sci U S A 1996, 93:9010-9014; Early growth response 1 (Egr1), Lee SL, et al.
Mol Cell Biol 1996, 16:4566-4572; Adrenergic receptor, alpha 2b (Adra2b), Link RE, et
al. Science 1996, 273:803-805; Angiotensin receptor 1a (Agtr1a), Matsusaka T, et al. J
Clin Invest 1996, 98:1867-1877; Interleukin 3 receptor, beta chain 1 (Il3rb2), Nicola
NA, et al. Blood 1996, 87:2665-2674; Transformation related protein 53 (Trp53),
Ohashi M, et al. Jpn J Cancer Res 1996, 87:696-701; Leukemia-associated gene (Lag),
Schubart U K, et al. J Biol Chem 1996, 271:14062-14066; Glutathione peroxidase 1
(Gpx1), Spector A, et al. Exp Eye Res 1996, 62:521-540; Arachidonate
12-lipoxygenase, leukocyte (Alox12l), Sun D, Funk CD. J Biol Chem 1996, 271:24055-
24062; Complement component 3 (C3), Sylvestre D, et al. J Exp Med 1996, 184:2385-
2392; CD30 antigen (Cd30), Texido G, et al. Eur J Immunol 1996, 26:1966-1969; Fc
receptor, IgE, low affinity II, alpha polypeptide (Fcer2a), Immunoglobulin heavy chain 5
(delta-like heavy chain) (Igh-5), Interleukin 4 (IL4) and Terminal-deoxynucleotidyl
transferase, Texido G, et al. Eur J Immunol 1996, 26:1966-1969; Apolipoprotein A-II
(Apoa2), Weng W, Breslow JL. Proc Natl Acad Sci U S A 1996, 93:14788-14794;
Amyloid beta (A4) precursor protein (App), Zheng H, et al. Ann N Y Acad Sci 1996,
777:421-426; cAMP responsive element binding protein 1 (Creb1), Blendy JA, et al.
Brain Res 1995, 681:8-14; Bradykinin receptor, beta 2 (Bdkrb2), Borkowski JA, et al. J
Biol Chem 1995, 270:13706-13710; Growth factor response protein (Gfrp), Crawford
PA, et al. Mol Cell Biol 1995, 15:4331-4336; Ciliary neurotrophic factor (Cntf), de
Chiara TM, et al. Cell 1995, 83:313-322; Cyclin dependent kinase inhibitor 1A (P21)
(Cdkn1a), Deng C, et al. Cell 1995, 82:675-684; Granzyme A (Gzma), Ebnet K, et al.
EMBO J 1995, 14:4230-4239; Very low density lipoprotein receptor (Vldlr), Frykman
PK, et al. Proc Natl Acad Sci U S A 1995, 92:8453-8457; Apolipoprotein E (Apoe) /
Apolipoprotein A-I (Apoa1), Goodrum JF, et al. J Neurobiol 1995, 64:408-416; Nitric
oxide synthase 1, neuronal (Nos1), Ichinose F, et al. Anesthesiology 1995, 83:101-108;
Nitric oxide synthase 2, inducible, macrophage (Nos2), Laubach VE, et al. Proc Natl
Acad Sci U S A 1995, 92:10688-10692; Peroxisome proliferator activated receptor alpha
(Ppara), Lee SS-T, et al. Mol Cell Biol 1995, 15:3012-3022; Growth factor response
protein (Gfrp), Lee SL, et al. Science 1995, 269:532-535; H19 fetal liver mRNA (H19) /
Insulin-like growth factor 2 (Igf2), Leighton PA, et al. Nature 1995, 375:34-39; Retinoic
acid receptor beta (RAR-beta), Luo J, et al. Mech Dev 1995, 53:61-71; Metallothionein
1 (Mt1) / Metallothionein 2 (Mt2), Philcox JC, et al. Biochem J 1995, 308:543-546;
Heme oxygenase (decycling) 2 (Hmox2), Poss KD, et al. Neuron 1995, 15:867-873;
H1-0 histone (H1fv), Sirotkin AM, et al. Proc Natl Acad Sci U S A 1995, 92:6434-6438;
Creatine kinase, mitochondrial 1, ubiquitous (Ckmt1), Steeghs K, et al. Biochim Biophys
Acta 1995, 1230:130-138; Tenascin C (Tnc), Steindler DA, et al. J Neurosci 1995,
15:1971-1983; Ia-associated invariant chain (Ii), Takaesu NT, et al. Immunity 1995,
3:385-396; Neuroblastoma ras oncogene (Nras), Umanoff H, et al. Proc Natl Acad Sci
USA 1995, 92:1709-1713; Receptor-associated protein of the synapse, 43 kDa (Rapsn),
Willnow TE, et al. Proc Natl Acad Sci USA 1995, 92:4537-4541; Vitronectin (Vtn),
Zheng X, et al. Proc Natl Acad Sci U S A 1995, 92:12426-12430; Preproacrosin (Acr),
Baba T, et al. J Biol Chem 1994, 269:31845-31849; Vimentin (Vim), Colucci GE, et al.
Cell 1994, 79:679-694; Tumor necrosis factor receptor superfamily, member 1b
(Tnfrsf1b), Erickson SL, et al. Nature 1994, 372:560-563; Cellular retinoic acid binding
protein I (Crabp1), Gorry P, et al. Proc Natl Acad Sci USA 1994, 91:9032-9036; cAMP
responsive element binding protein 1 (Creb1), Hummler F, et al. Proc Natl Acad Sci
USA 1994, 91:5647-5651; Pore forming protein (Pfp), Kagi D, et al. Nature 1994,
369:31-37; Src-related kinase lacking C-terminal regulatory tyrosine and N-terminal
myristylation sites (Srms), Kohmura N, et al. Mol Cell Biol 1994, 14:6915-6925; CD3
polypeptide zeta (Cd3z) / Solute carrier family 22, member 1 (Slc22a1), Koyasu S, et al.
EMBO J 1994, 13:784-797; Pore forming protein (Pfp), Lowin B, et al. Proc Natl Acad
Sci USA 1994, 91:11571-11575; Retinoic acid receptor, beta (Rarb), Retinoic acid
receptor beta2 (RARbeta2), Mendelsohn C, et al. Dev Biol 1994, 166:246-258;
Transthyretin (Ttr), Palha JA, et al. J Biol Chem 1994, 269:33135-33139; Procollagen,
type X, alpha 1 (Col10a1), Rosati R, et al. Nature Genet 1994, 8:129-135; P
glycoprotein 3 (Pgy3), Schinkel AH, et al. Cell 1994, 77:491-502; Yamaguchi sarcoma
viral (v-yes) oncogene homolog (Yes), Stein PL, et al. Genes Dev 1994, 8:1999-2007;
Fc receptor, IgE, high affinity II, alpha polypeptide (Fcer2a), Stief A, et al. J Immunol
1994, 152:3378-3390; Pore forming protein (Pfp), Walsh CM, et al. Proc Natl Acad Sci
USA 1994, 91:10854-10858; CD2 antigen (Cd2), Evans CF, et al. J Immunol 1993,
151:6259-6264; Mannose-6-phosphate receptor, cation dependent (M6pr), Koster A, et
al. EMBO J 1993, 12:5219-5223; Retinoic acid receptor, alpha (Rara), Li E, et al. Proc 
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Natl Acad Sci USA 1993, 90:1590-1594, Lufkin T, et al. Proc Natl Acad Sci USA 1993. 
90:7225-7229; Retinoic acid receptor, gamma (Rarg), Lohnes D, et al. Cell 1993, 
73:643-658; Tumor necrosis factor receptor 1 (TNF-R-1) (Tnfrl), Pfeffer K, et al. Cell 
1993, 73:457-467; Lectin, galactose binding, soluble 1 (Lgalsl), Poirier F, Robertson EJ. 
5 Development 1993, 1 19:1229-1236; Synapsin I (Synl), Rosahl TW, et al. Cell 1993, 
75:661-670; Tumor necrosis factor receptor 1 (Tnfrl), Rothe J, et al. Nature 1993, 
364:798-802; Beta-2 microglobulin (B2m), Correa I, et al. Proc Natl Acad Sci USA 
1992, 89:653-657; CD2 antigen (Cd2), Killeen N, et al. EMBO J 1992, 1 1 :4329-4336; 
Apolipoprotein E (Apoe), Piedrahita JA, et al. Proc Natl Acad Sci USA 1 992, 89:447 1 - 

10 4475; Myogenic differentiation I (Myodl ), Rudnicki MA, et al. Cell 1 992, 7 1 :383-390; . 
Tenascin C (Tnc), Saga Y, et al. Genes Dev 1992, 6: 1 821-1831 ; Beta-2 microglobulin 
(B2m), Sanjuan N, et al. J Virol 1992, 66:4587-4590; Neuroblastoma myc-related 
oncogene 1 (Nmycl), Stanton BR, et al. Genes Dev 1992, 6:2235-2247; and 
Hemoglobin alpha chain complex (Hba), Popp RA, et al. Genetics 1983, 105:157-167. 

Some preferred single-copy, non-essential genes with no phenotypes of the
present invention include, but are not limited to, the following: Moesin (Msn), Doi Y., et
al., J Biol Chem 1999, 274:2315-2321; Plasminogen activator inhibitor, type II (Planh2)
and Planh1, Dougherty K.M., Proc Natl Acad Sci USA 1999, 96:686-691; Nuclear
receptor coactivator 1 (Ncoa1), Qi C, et al. (1999) Proc Natl Acad Sci USA 96:1585-1590;
Nuclear factor of kappa light chain gene enhancer in B-cells inhibitor, alpha and
beta (Nfkbia and Nfkbib), Cheng JD, et al. (1998) J Exp Med 6:1055-1062; H19 fetal
liver mRNA (H19), Jones BK, et al. (1998) Genes Dev 12:2200-2207; Prion protein
(Prnp), Lipp HP, et al. Behav Brain Res 1998, 95:47-54; Centromere autoantigen B
(Cenpb), Perez-Castro AV, et al. Dev Biol 1998, 201:135-143; Placentae and embryos
oncofetal gene (Pem), Pitman JL, et al. Dev Biol 1998, 202:196-214; Externally
regulated phosphatase (Ptpn16), Dorfman K, et al. Oncogene 1996, 13:925-931;
Transformation related protein 53 (Trp53), Ohashi M, et al. Jpn J Cancer Res 1996,
87:696-701; H1-0 histone (H1fv), Sirotkin AM, et al. Proc Natl Acad Sci USA 1995,
92:6434-6438; Creatine kinase, mitochondrial 1, ubiquitous (Ckmt1), Steeghs K, et al.
Biochim Biophys Acta 1995, 1230:130-138; Neuroblastoma ras oncogene (Nras),
Umanoff H, et al. Proc Natl Acad Sci USA 1995, 92:1709-1713; Vitronectin (Vtn),
Zheng X, et al. Proc Natl Acad Sci USA 1995, 92:12426-12430; Vimentin (Vim),
Colucci GE, et al. Cell 1994, 79:679-694; Cellular retinoic acid binding protein I
(Crabp1), Gorry P, et al. Proc Natl Acad Sci USA 1994, 91:9032-9036; Retinoic acid
receptor beta2 (RARbeta2), Mendelsohn C, et al. Dev Biol 1994, 166:246-258; Retinoic
acid receptor, alpha (Rara), Li E, et al. Proc Natl Acad Sci USA 1993, 90:1590-1594,
Lufkin T, et al. Proc Natl Acad Sci USA 1993, 90:7225-7229; Lectin, galactose binding,
soluble 1 (Lgals1), Poirier F, Robertson EJ. Development 1993, 119:1229-1236;
Myogenic differentiation 1 (Myod1), Rudnicki MA, et al. Cell 1992, 71:383-390; and
Tenascin C (Tnc), Saga Y, et al. Genes Dev 1992, 6:1821-1831.

In view of the guidance of the present specification, one of ordinary skill in the
art can select similar, suitable, single-copy, non-essential genes in mice and other cell
types/organisms.

Assembly of Targeting Cassettes 

The targeting cassettes described herein can be constructed utilizing
methodologies known in the art of molecular biology (see, for example, Ausubel or
Maniatis) in view of the teachings of the specification. As described above, the targeting
constructs are assembled by inserting, into a suitable vector backbone, polynucleotides
encoding a reporter, such as a light-generating protein, e.g., a luciferase gene, operably
linked to a promoter of interest; a sequence encoding a positive selection marker; and,
optionally, a sequence encoding a negative selection marker. In addition, the targeting
cassette contains insertion sites such that sequences targeting a single-copy, non-essential
gene can be readily inserted to flank the sequence encoding the positive selection marker
and the luciferase-encoding sequence.

A preferred method of obtaining polynucleotides, including suitable regulatory sequences
(e.g., promoters), is PCR. General procedures for PCR are taught in MacPherson et al.,
PCR: A Practical Approach (IRL Press at Oxford University Press (1991)). PCR
conditions for each reaction may be empirically determined. A number of
parameters influence the success of a reaction. Among these parameters are annealing
temperature and time, extension time, Mg2+ and ATP concentration, pH, and the relative
concentration of primers, templates and deoxyribonucleotides. Exemplary primers are
described below in the Examples. After amplification, the resulting fragments can be
detected by agarose gel electrophoresis followed by visualization with ethidium bromide
staining and ultraviolet illumination.
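
Because annealing conditions are determined empirically, a rough starting estimate is sometimes computed from primer composition before optimization. The following is a minimal sketch, assuming the common rule of thumb of 2 degrees C per A/T and 4 degrees C per G/C; this rule is not taken from the present specification, is only approximate for short oligonucleotides, and the 20-mer shown is hypothetical rather than a primer from the Examples.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer (case-insensitive)."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rule_of_thumb_tm(primer: str) -> int:
    """Approximate melting temperature in degrees C: 2*(A+T) + 4*(G+C).

    Assumption: this simple rule is only a starting point; the actual
    annealing temperature is still determined empirically.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical 20-mer used only for illustration.
example_primer = "ATGCATGCATGCATGCATGC"
print(gc_fraction(example_primer), rule_of_thumb_tm(example_primer))
```

In practice the estimate would only narrow the range of annealing temperatures to test.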

In one embodiment, PCR can be used to amplify fragments from genomic
libraries. Many genomic libraries are commercially available. Alternatively, libraries
can be produced by any method known in the art. Preferably, the organism(s) from
which the DNA is obtained has no discernible disease or phenotypic effects. This isolated
DNA may be obtained from any cell source or body fluid (e.g., ES cells, liver, kidney, blood
cells, buccal cells, cervicovaginal cells, epithelial cells from urine, fetal cells, or any cells
present in tissue obtained by biopsy, urine, blood, cerebrospinal fluid (CSF), and tissue
exudates at the site of infection or inflammation). DNA is extracted from the cells or
body fluid using known methods of cell lysis and DNA purification. The purified DNA
is then introduced into a suitable expression system, for example a lambda phage.

Another method for obtaining polynucleotides, for example, short, random
nucleotide sequences, is by enzymatic digestion. As described below in the Examples,
short DNA sequences are generated by digestion of DNA from vectors carrying genes
encoding luciferase (yellow green or red).

Polynucleotides are inserted into vector genomes using methods known in the art.
For example, insert and vector DNA can be contacted, under suitable conditions, with a
restriction enzyme to create complementary or blunt ends on each molecule that can pair
with each other and be joined with a ligase. Alternatively, synthetic nucleic acid linkers
can be ligated to the termini of a polynucleotide. These synthetic linkers can contain
nucleic acid sequences that correspond to a particular restriction site in the vector DNA.
Other means are known and, in view of the teachings herein, can be used.
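
Whether two restriction-digested ends can pair depends on the overhang each enzyme leaves. The following is a minimal sketch, assuming the standard recognition sites and cut positions for several enzymes named in the Examples; the table and helper names are illustrative only and are not part of the disclosed method.

```python
# Recognition site (5'->3', top strand) and top-strand cut offset.
# All sites listed here are palindromic.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),
    "PstI": ("CTGCAG", 5),
    "NsiI": ("ATGCAT", 5),
    "SbfI": ("CCTGCAGG", 6),
    "SwaI": ("ATTTAAAT", 4),
    "SalI": ("GTCGAC", 1),
    "AscI": ("GGCGCGCC", 2),
    "BamHI": ("GGATCC", 1),
    "HindIII": ("AAGCTT", 1),
}


def end_type(enzyme: str) -> tuple:
    """Return (kind, overhang) where kind is "5'", "3'" or "blunt"."""
    site, cut = ENZYMES[enzyme]
    mirror = len(site) - cut  # cut position on the opposite strand
    if cut == mirror:
        return ("blunt", "")
    kind = "5'" if cut < mirror else "3'"
    lo, hi = sorted((cut, mirror))
    return (kind, site[lo:hi])


def compatible(enzyme_a: str, enzyme_b: str) -> bool:
    """True if the two enzymes leave ends that can pair and be ligated."""
    return end_type(enzyme_a) == end_type(enzyme_b)


print(end_type("NsiI"), end_type("SbfI"), compatible("NsiI", "SbfI"))
```

Under these assumptions NsiI and SbfI both leave a 3' TGCA overhang, which is consistent with an NsiI-digested fragment being cloned into an SbfI-linearized vector.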

The final constructs can be used immediately (e.g., for introduction into ES
cells), or stored frozen (e.g., at -20°C) until use. Preferably, the constructs are linearized
prior to use, for example by digestion with suitable restriction endonucleases.

Transgenic Animals

The targeting constructs containing the light generating protein coding sequences
(e.g., luciferase genes) are introduced into a pluripotent cell (e.g., ES cell, Robertson, E.
J., In: Current Communications in Molecular Biology, Capecchi, M. R. (ed.), Cold
Spring Harbor Press, Cold Spring Harbor, N.Y. (1989), pp. 39-44). Suitable ES cells
may be derived or isolated from any species or from any strain of a particular species.
Although not required, the pluripotent cells are typically derived from the same species
as the intended recipient. ES cells may be obtained from commercial sources, from
International Depositories (e.g., the ATCC) or, alternatively, may be obtained as
described in Robertson, E. J., supra. Examples of clonally-derived ES cell lines include
129/SvJ ES cells, RW-4 and C57BL/6 ES cells (Genome Systems, Inc.).

ES cells are cultured under suitable conditions, for example, as described in
Ausubel et al., section 9.16, supra. Preferably, ES cells are cultured on stromal cells
(such as STO cells (especially SNC4 STO cells) and/or primary embryonic fibroblast
cells) as described by E. J. Robertson, supra, pp 71-112. Culture media preferably
includes leukemia inhibitory factor ("lif") (Gough, N. M. et al., Reprod. Fertil. Dev.
1:281-288 (1989); Yamamori, Y. et al., Science 246:1412-1416 (1989)), which appears to
help keep the ES cells from differentiating in culture. Stromal cells transformed with the
gene encoding lif can also be used.

The targeting constructs are introduced into the ES cells by any method which
will permit the introduced molecule to undergo recombination at its regions of
homology, for example, micro-injection, calcium phosphate transformation, or
electroporation (Toneguzzo, F. et al., Nucleic Acids Res. 16:5515-5532 (1988); Quillet,
A. et al., J. Immunol. 141:17-20 (1988); Machy, P. et al., Proc. Natl. Acad. Sci. (U.S.A.)
85:8027-8031 (1988)). The construct to be inserted into the ES cell must first be in the
linear form. Thus, if the knockout construct has been inserted into a vector as described
above, linearization is accomplished by digesting the DNA with a suitable restriction
endonuclease selected to cut only within the vector sequence and not within the knockout
construct sequence. If the ES cells are to be electroporated to insert the construct, the ES
cells and construct DNA are exposed to an electric pulse using an electroporation
machine and following the manufacturer's guidelines for use. After electroporation, the
ES cells are typically allowed to recover under suitable incubation conditions. The cells
are then cultured under conventional conditions, as are known in the art, and screened for
the presence of the construct.

Screening and selection of those cells into which the targeting construct has been
integrated can be achieved using the positive selection marker and/or the negative
selection marker in the construct. In preferred embodiments, the construct contains both
positive and negative selection markers. In one aspect, methods which rely on
expression of the selection marker are used, for example, by adding the appropriate
substrate to select only those cells which express the product of the positive selection
marker or to eliminate those cells expressing the negative selection marker. For
example, where the positive selection marker encodes neomycin resistance, G418 is
added to the transformed ES cell culture media at increasing dosages. Similarly, where
the negative selection marker is used, a suitable substrate (e.g., gancyclovir if the
negative selection marker encodes HSV-TK) is added to the cell culture. Either before
or after selection using the appropriate substrate, the presence of the positive and/or
negative selection markers in a recipient cell can also be determined by other methods,
for example, hybridization, detection of radiolabeled nucleotides, PCR and the like. In
preferred embodiments, cells having integrated targeting constructs are first selected by
adding the appropriate substrate for the positive and/or negative selection markers. Cells
that survive the selection process are then screened by other methods, such as PCR or
Southern blotting, for the presence of integrated sequences.
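
The combined positive and negative selection described above can be summarized as a simple truth table. The following is a minimal sketch, assuming a neomycin resistance marker selected with G418 and an HSV-TK marker counter-selected with gancyclovir; the function and case names are illustrative only.

```python
def survives_selection(has_neo: bool, expresses_tk: bool) -> bool:
    """True if a cell survives selection with both G418 and gancyclovir."""
    survives_g418 = has_neo                   # Neo confers G418 resistance
    survives_gancyclovir = not expresses_tk   # TK makes gancyclovir cytotoxic
    return survives_g418 and survives_gancyclovir


# Illustrative cases: (construct integrated with Neo?, TK expressed?)
cases = {
    "no integration": (False, False),
    "random integration, TK expressed": (True, True),
    "random integration, TK silent": (True, False),
    "homologous recombination, TK lost": (True, False),
}

for label, (neo, tk) in cases.items():
    print(f"{label}: {'survives' if survives_selection(neo, tk) else 'killed'}")
```

The third case survives even though integration was random, which is why surviving clones are still screened by PCR or Southern blotting.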

After suitable ES cells containing the construct in the proper location have been
identified, the cells can be inserted into an embryo, preferably a blastocyst. The
blastocysts are obtained by perfusing the uterus of pregnant females. In one embodiment,
the blastocysts are obtained from, for example, the FVB/N strain of mice and the ES cells
are obtained from, for example, the C57BL/6 strain of mice. Suitable methods for
accomplishing this are known to the skilled artisan, and are set forth by, e.g., Bradley et
al., (1992) Biotechnology, 10:534-539. Insertion into the embryo may be accomplished
in a variety of ways known to the skilled artisan; however, a preferred method is by
microinjection. For microinjection, about 10-30 ES cells are collected into a micropipet
and injected into embryos that are at the proper stage of development to permit
integration of the foreign ES cell containing the construct into the developing embryo.
The suitable stage of development for the embryo used for insertion of ES cells is species
dependent; in mice it is about 3.5 days.

While any embryo of the right stage of development is suitable for use, it is
preferred that blastocysts are used. In addition, preferred blastocysts are male and,
furthermore, preferably have genes encoding a coat color that is different from that
encoded by the genes of the ES cells. In this way, the offspring can be screened easily for
the presence of the knockout construct by looking for mosaic coat color (indicating that the
ES cell was incorporated into the developing embryo). Thus, for example, if the ES cell
line carries the genes for black fur, the blastocyst selected will carry genes for white or
brown fur.

After the ES cell has been introduced into the blastocyst, the blastocyst is
typically implanted into the uterus of a pseudopregnant foster mother for gestation.
Pseudopregnant females are prepared by mating with vasectomized males of the same
species and successful implantation usually must occur within about 2-3 days of mating.

Offspring are screened initially for mosaic coat color where the coat color
selection strategy has been employed. Southern blots and/or PCR may also be used to
determine the presence of the sequences of interest. Mosaic (chimeric) offspring are then
bred to each other to generate homozygous animals. Homozygotes and heterozygotes
may be identified by Southern blotting of equivalent amounts of genomic DNA from
mice that are the product of this cross, as well as mice that are known heterozygotes and
wild type mice. Alternatively, Northern blots can be used to probe the mRNA to identify
the presence or absence of transcripts encoding either the replaced gene, the light
generating protein coding sequence (e.g., luciferase gene), or both. In addition, Western
blots can be used to assess the level of expression of the luciferase protein with an
antibody against the luciferase gene product. Finally, in situ analysis (such as fixing the
cells and labeling with antibody) and/or FACS (fluorescence activated cell sorting)
analysis of various cells from the offspring can be conducted using suitable (e.g.,
anti-luciferase) antibodies to look for the presence or absence of the targeting construct.

In one embodiment of the present invention, the animals are from the C57BL/6
mouse strain. This strain develops a variety of tumors and has been used to develop a
number of tumor cell lines, for example, B16 melanoma cells (including B16F10,
B16D5, and B16F1), Lewis lung carcinoma cells (including LLC, LLC-h59), T241
mouse fibrosarcoma cells, RM-1 and pTC2 mouse prostate cancer cells, and MCA207
mouse sarcoma cells. These cell lines have been extensively used for in vivo tumor
biology studies after injection into C57BL/6 mice. The targeted transgenic mice
generated in the Examples are in the C57BL/6 genetic background, and these animals are
suitable for injection or implantation of such tumor cells, as well as other tumor cells
described in the literature that are immunocompatible with C57BL/6 mice. Thus, the
transgenic animals can then be used, for example, to monitor, in vivo, tumor progression
(e.g., growth) and the efficacy of therapies on tumor regression. For example, where the
transgenic animal is tumor-susceptible, it is monitored for expression of a reporter, e.g.,
luciferase, which is indicative of tumorigenesis and/or angiogenesis. The monitoring of
expression of luciferase reporter expression cassettes using non-invasive whole animal
imaging has been described (Contag, C. et al., U.S. Patent No. 5,650,135, July 22, 1997;
Contag, P., et al., Nature Medicine 4(2):245-247, 1998; Contag, C., et al., OSA TOPS on
Biomedical Optical Spectroscopy and Diagnostics 3:220-224, 1996; Contag, C.H., et al.,
Photochemistry and Photobiology 66(4):523-531, 1997; Contag, C.H., et al., Molecular
Microbiology 18(4):593-603, 1995). Such imaging typically uses at least one photo
detector device element, for example, a charge-coupled device (CCD) camera.

The transgenic animals described herein can also be used to determine the effect
of an analyte (e.g., therapy), for example on tumor progression where the promoter
induces light generating protein (e.g., luciferase) expression when a tumor develops.
Methods of administration of the analyte include, but are not limited to, injection
(subcutaneously, epidermally, intradermally), intramucosal (such as nasal, rectal and
vaginal), intraperitoneal, intravenous, oral or intramuscular. Other modes of
administration include oral and pulmonary administration, suppositories, and transdermal
applications. Dosage treatment may be a single dose schedule or a multiple dose
schedule. For example, the analyte of interest can be administered over a range of
concentrations to determine a dose/response curve. The analyte may be administered to a
series of test animals or to a single test animal (given that response to the analyte can be
cleared from the transgenic animal).

The following examples are intended only to illustrate the present invention and
should in no way be construed as limiting the subject invention.

EXAMPLES
Example 1 

Generating the Targeting Cassette and Vector 


A. Creation of the Backbone Vector 

pTK53: The 0.5 kb mouse phosphoglycerate kinase 1 promoter was amplified
with PGK primers (PGKF, SEQ ID NO:1:
ATCGAATTCTACCGGGTAGGGGAGGCGCTTT; PGKR, SEQ ID NO:2:
GGCTGCAGGTCGAAAGGCCCGGAGATGAGG) using mouse genomic DNA
(Genome Systems, Inc., St. Louis, MO) as template. This fragment was then double
digested with EcoRI and PstI and cloned into the pKS vector (Stratagene, La Jolla,
California) which was linearized with the same enzymes. The neomycin gene was
amplified with NeoF (SEQ ID NO:3:
ACCTGCAGCCAATATGGGATCGGCCATTGAAC) and NeoR (SEQ ID NO:4:
GGATCCGCGGCCGCCCCCAGCTGGTTCTTTCCGCCTC) primers using
pNTKV1907 (Stratagene) as a template. The 1.1 kb PCR fragment was double digested
with PstI and BamHI and cloned into the pKS-PGK vector which was linearized with the
same enzymes. This pKS-PGK-Neo vector was used to clone the thymidine kinase (TK)
gene as follows. Primers TKF (SEQ ID NO:5:
GGATCCTCTAGAGTCGAGCAGTGTGGTTTT) and TKR (SEQ ID NO:6:
GAGCTCCCGTAGTCAGGTTTAGTTCGTCCG) were used to amplify the TK gene
from pNTKV1907 (Stratagene). The amplified 2 kb fragment was then digested with
BamHI and SacI and cloned into the pKS-PGK-Neo vector that was linearized with the same
enzymes. This constructed vector was designated pTK. A synthetic linker, F5R5, was
made by annealing two primers (forward primer, F5R51, SEQ ID NO:7:
GTACATTTAAATCCTGCAGG; reverse primer, F5R52, SEQ ID NO:8:
AGCTCCTGCAGGATTTAAAT). This linker was inserted between the Asp718I and
HindIII sites of pTK and the new construct was designated pTK5. A second synthetic
linker, F3R3, was made by annealing two primers (forward primer, F3R31, SEQ ID
NO:9:
GGCCCGGGCTTAATTAATGCATCATATGGTACCGTTTAAACGCGGCCGCAAGCTTGTCGACGGCGCGCCGGCCGGCC;
reverse primer, F3R32, SEQ ID NO:10:
GATCGGCCGGCCGGCGCGCCGTCGACAAGCTTGCGGCCGCGTTTAAACGGTACCATATGATGCATTAATTAAGCCCG).
This linker was inserted between the NotI and
BamHI sites of pTK and the new construct was designated pTK53. Schematics of the
vectors are shown in Figure 1.
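
The two linkers above are each formed by annealing a pair of oligonucleotides. The following is a minimal sketch that checks, for SEQ ID NOs:7-10 as given above, that each pair is complementary apart from a 5' overhang; the 4-nucleotide overhang length and the helper names are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def anneal(forward: str, reverse: str, overhang: int = 4) -> tuple:
    """Return (duplex_pairs, forward 5' overhang, reverse 5' overhang)."""
    duplex_pairs = forward[overhang:] == reverse_complement(reverse[overhang:])
    return duplex_pairs, forward[:overhang], reverse[:overhang]


# Linker F5R5 (SEQ ID NO:7 and SEQ ID NO:8)
F5R51 = "GTACATTTAAATCCTGCAGG"
F5R52 = "AGCTCCTGCAGGATTTAAAT"

# Linker F3R3 (SEQ ID NO:9 and SEQ ID NO:10)
F3R31 = ("GGCCCGGGCTTAATTAATGCATCATATGGTACCGTTTAAACGCGGCCGCAAG"
         "CTTGTCGACGGCGCGCCGGCCGGCC")
F3R32 = ("GATCGGCCGGCCGGCGCGCCGTCGACAAGCTTGCGGCCGCGTTTAAACGGTA"
         "CCATATGATGCATTAATTAAGCCCG")

print(anneal(F5R51, F5R52))
print(anneal(F3R31, F3R32))
```

Under the assumed overhang length, the F5R5 ends read GTAC and AGCT and the F3R3 ends read GGCC and GATC, which match the 5' overhangs left by Asp718I, HindIII, NotI and BamHI, respectively.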


B. Introduction of Luciferase 

pTK-LucYG and pTK-LucR: The yellow green luciferase gene was isolated
from pGL3 vector (Promega) as a HindIII-SalI fragment and was cloned into pTK53 that
was linearized with the same enzymes. The new construct was designated pTK-LucYG
(8931 bp), shown in Figure 2.

The red luciferase gene was isolated from pGL3-red vector (Dr. Christopher
Contag, Stanford University, Stanford, Calif.) as a HindIII-SalI fragment and was cloned
into pTK53 that was linearized with the same enzymes. The new construct was
designated pTK-LucR (8931 bp), shown in Figure 2.


Example 2 

Insertion of Targeting Sequences 


A. Generation of vitronectin targeting vector: The targeting construct
pTKLR-Vn was generated by inserting vitronectin (VN) DNA sequences into pTK-LucR
vector.


Vitronectin (VN) is an abundant glycoprotein present in plasma and the
extracellular matrix of most tissues. In a previous study, it was shown that heterozygous
mice carrying one normal and one null VN allele and homozygous null mice completely
deficient in vitronectin demonstrate normal development, fertility, and survival. This
suggests that VN is not essential for cell adhesion and migration during normal mouse
development (Zheng, X., et al., Proc Natl Acad Sci USA 1995 92:12426-30). Mouse
vitronectin genomic DNA sequence of 5004 bp was obtained from GenBank database
(Accession number X72091). Based on this sequence, a 1.63 kb 3' end vitronectin
fragment was amplified (reverse primer, VN1R, SEQ ID NO:11:
CTGTATTTAAATCTGCCCACCCTATTCAGGACAGTAGTC; forward primer,
VN1F, SEQ ID NO:12: CCAATGCATCAACCCAGCCAGGAGGAGTGCG) using
mouse C57BL/6 genomic DNA as template (Genome Systems, Inc., St. Louis, MO).
This fragment was digested with SwaI and NsiI and cloned into pTK-LucR (linearized
with SwaI and SbfI). This construct was designated as pTK-LucR3. Subsequently, a 2.35
kb 5' end vitronectin fragment was amplified (reverse primer, VN2R, SEQ ID NO:13:
AACGCGTCGACTTCGGAGATGTTTCGCKjGATAACCAGG; forward primer,
VN2F, SEQ ID NO:14:
TTGGCGCGCCCCATAGAGAAGAGACACCAAAGGCACGCTC) using mouse
C57BL/6 genomic DNA as template. This fragment was digested with SalI and AscI and
cloned into pTK-LucR vector that was linearized with SalI and AscI. This construct was
designated as pTKLR-Vn. Figure 3 shows the restriction map of pTKLR-Vn vector. The
polylinker between the neomycin gene and red luciferase gene is used to insert the
VEGF promoter or other promoters of interest. The predicted homologous
recombination between pTKLR-Vn and vitronectin gene is illustrated in Figure 3A.
Upon insertion of the VEGF-LucR transgene cassette, the endogenous vitronectin gene is
destroyed. Figure 3B shows the genomic DNA sequence of VN.


B. Generation of Fos targeting vector: The targeting construct pTKLG-Fos was 
generated by inserting FosB DNA sequences into pTK-LucYG vector. 

FosB is one of the members of the Fos family. It plays a functional role in
transcriptional regulation. It has been shown that FosB-deficient mice are born at a normal
frequency, are fertile and present no obvious phenotypic or histologic abnormalities
(Gruda et al. (1996) Oncogene 12:2177-2185). A 28.8 kb genomic region that contains
mouse FosB DNA sequence was obtained from GenBank database (Accession number
AF093624).

Using this sequence, a 1.71 kb 5' end FosB fragment was amplified (forward
primer, FosB1F, SEQ ID NO:15:
CTGTATTTAAATCCCGTTTCTCACTGTGCCTGTGTC; reverse primer, FosB1R,
SEQ ID NO:16: GTCTCCTGCAGGCTTCCTCCTCCTTGTTCCTTGCG) using mouse
C57BL/6 genomic DNA as template. This fragment was digested with SwaI and SbfI and
cloned into pTK-LucYG vector that was linearized with SwaI and SbfI. This construct
was designated as pTK-LucYG3. Subsequently, a 1.58 kb 3' end FosB fragment was
amplified (forward primer, FosB2F, SEQ ID NO:17:
AACGCGTCGACGGATGGGATTGACCCCCAGCCCTC; reverse primer, FosB2R,
SEQ ID NO:18: TTGGCGCGCCCCTTGCCTCCACCTCTCAAATGC) using mouse
C57BL/6 genomic DNA as template. This fragment was digested with SalI and AscI and
cloned into pTK-LucYG vector that was linearized with SalI and AscI. This construct
was designated as pTKLG-Fos (Figure 4). The polylinker between the neomycin gene
and yellow green luciferase gene is used to insert the VEGFR2 promoter (Example 3,
Figure 5A), Tie2 promoter (Example 3, Figure 5B), as well as other promoters of interest.
The predicted homologous recombination between the targeting vector bearing the VEGFR2
promoter (Figure 5A) or the Tie2 promoter (Figure 5B) and FosB gene is also illustrated.
As shown in the Figures, the VEGFR2-LucYG transgene cassette and Tie2-LucYG
transgene cassette are inserted downstream of the FosB gene translational stop signal.
Therefore, the targeted transgenic mice should still have a functional FosB gene while
expressing the transgenes. Figure 4B shows the DNA sequence of FosB.
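
Each FosB arm primer carries, near its 5' end, the recognition site of the enzyme later used to digest the amplified fragment. The following is a minimal sketch that locates those sites in SEQ ID NOs:15-18 as given above; the recognition sequences are the standard ones for SwaI, SbfI, SalI and AscI, and the dictionary layout is illustrative only.

```python
# Standard recognition sequences for the enzymes used on the FosB arms.
SITES = {
    "SwaI": "ATTTAAAT",
    "SbfI": "CCTGCAGG",
    "SalI": "GTCGAC",
    "AscI": "GGCGCGCC",
}

# Primer name -> (sequence from the text, enzyme used on that end)
PRIMERS = {
    "FosB1F": ("CTGTATTTAAATCCCGTTTCTCACTGTGCCTGTGTC", "SwaI"),
    "FosB1R": ("GTCTCCTGCAGGCTTCCTCCTCCTTGTTCCTTGCG", "SbfI"),
    "FosB2F": ("AACGCGTCGACGGATGGGATTGACCCCCAGCCCTC", "SalI"),
    "FosB2R": ("TTGGCGCGCCCCTTGCCTCCACCTCTCAAATGC", "AscI"),
}


def site_position(primer_name: str) -> int:
    """0-based index of the enzyme site in the primer, or -1 if absent."""
    sequence, enzyme = PRIMERS[primer_name]
    return sequence.find(SITES[enzyme])


for name in PRIMERS:
    print(name, PRIMERS[name][1], site_position(name))
```

A result of -1 for any primer would indicate that the stated digestion could not cut the corresponding fragment end.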

Example 3 

Insertion of Promoter Sequences of Interest

A. pTKLR-Vn/VEGF: Mouse VEGF genomic DNA sequence of 2240 bp that
contains a partial VEGF promoter region was obtained from GenBank (accession
number: U41383). Accordingly, primers were designed to amplify a 0.69 kb (VF1-VR1A;
Table 1) and a 0.98 kb fragment (VF2-VR2; Table 1). It was confirmed that
each pair of primers can amplify the predicted product using mouse 129SvJ genomic
DNA as template.

Table 1

Name    SEQ ID NO    Sequence
VF1     19           ACCTC ACTCT CCTGT CTCCC CTGAT TCCCA A
VR1A    20           GCTCT GGCGG TCACC CCCAA AAGCA
VF2     21           CCCTT TCCAA GACCC GTGCC ATTTG AGC
VR2     22           ACTTT GCCCC TGTCC CI CI C TCTGT TCGC
KF1     23           GCTGC GTCCA GATTT GCTCT CAGAT GCG
KR1     24           TTCTC AGGCA CAGAC TCCTT CTCCG TCCCT
KF2     25           CAGAT GGACG AGAAA ACAGT AGAGG CGTTG GC
KR2     26           GAGGA CTCAG GGCAG AAAGA GAGCG
TF3     27           AGCTT AGCCT GCAAG GGTGG TCCTC ATCG
TF2     28           CAAAT GCACC CCAGA GAACA GCTTA GCCTG C
TR1     29           GCTTT CAACA ACTCA CAACT TTGCG ACTTC CCG

Conditions for PCR amplification are shown in Figure 6. These primers were
used for PCR screening of mouse 129/SvJ genomic DNA BAC (bacterial artificial
chromosome) library (Genome Systems, Inc., St. Louis, MO). The library, on average,
contained inserts of 120 kb with sizes ranging from 50 kb to 240 kb. A large
genomic DNA fragment that contained VEGF promoter region was obtained. Southern
blot analysis was performed to map the VEGF promoter region. A unique HindIII
restriction site was mapped approximately 7.8 kb upstream of the ATG translational start
codon of the VEGF gene. The sequences between HindIII and ATG translational start
codon are inserted into the polylinker of pTKLR-Vn vector to finish the construction of
targeting vector that contains VEGF-LucR transgene (Figure 3A).

B. pTKLG-Fos/VEGFR2

A mouse VEGFR2 genomic DNA sequence of 1079 bp which contains a partial
VEGFR2 promoter region was published previously (Ronicke, et al., (1996) Circ. Res.
79:277-285). Accordingly, primers were designed to amplify a 0.45 kb (KF1-KR1A;
Table 1) and a 0.58 kb fragment (KF2-KR2; Table 1). It was confirmed that each pair of
primers can amplify the predicted product using mouse 129SvJ genomic DNA as
template. DNA sequences for these primers are shown in Table 1 above and PCR
amplification conditions are shown in Figure 6. These primers were used for PCR
screening of mouse 129/SvJ genomic DNA BAC library. From the screening, a BAC
clone that contained a large genomic DNA fragment was obtained. Based on the
published restriction map of VEGFR2, a 4.5 kb HindIII-XbaI fragment that covers the
VEGFR2 promoter region was subcloned from the VEGFR2 BAC clone into the
pBluescriptSK vector (Stratagene, La Jolla, CA) that was linearized with HindIII and
XbaI. The VEGFR2 promoter sequences of 4.1 kb, spanning from a HindIII site to the
ATG translational start codon, are subcloned into the polylinker of pTKLG-Fos vector to
construct the targeting vector that contains VEGFR2-LucYG transgene (Figure 5A).

C. pTKLG-Fos/Tie2 

Mouse Tie2 genomic DNA sequence of 477 bp which contains a partial Tie2
promoter region has been published previously (Fadel et al. (1998) Biochem. J. 330(Pt.
1):335-343). Accordingly, primers were designed to amplify a 0.45 kb (TF3-TR1; Table
1) and a 0.47 kb fragment (TF2-TR1; Table 1). It was confirmed that each pair of
primers amplified the predicted product using mouse 129SvJ genomic DNA as template.
DNA sequences for these primers are shown in Table 1 above. PCR amplification
conditions are shown in Figure 6. These primers are used for PCR screening of mouse
129SvJ genomic DNA BAC library. From the screening, a BAC clone containing a
large genomic DNA fragment of the Tie2 promoter region was obtained. Based on the
published Tie2 genomic DNA restriction map (see, Dumont et al., supra), a 10.5 kb
Asp718-EcoRV fragment containing the Tie2 promoter region was subcloned from the
Tie2 BAC clone into the pSK vector that was linearized with Asp718 and EcoRV. The
Tie2 promoter sequences of about 6.8 kb, spanning from the Asp718 site to the ATG
translational start codon, are subcloned into the polylinker of pTKLG-Fos vector to
construct the Tie2-LucYG targeting vector (Figure 5B).

Example 4

Generation of Transgenic Mice Carrying the Constructs of the Present Invention

A. General Procedure: Figure 7 depicts a generalized description of generation
of transgenic mice using the targeted transgenic vectors described in Example 3. Details
regarding embryonic stem (ES) cell culture, transfection, blastocyst injection and
implantation into a pseudopregnant foster mother are described, for example, in Hogan et
al. (1994) "Manipulating the Mouse Embryo, A Laboratory Manual. Second Edition", Cold
Spring Harbor Laboratory Press.

After construction, the targeted transgenic constructs are transfected into C57BL/6
embryonic stem (ES) cells (Genome Systems, Inc., St. Louis, MO) by electroporation.
The antibiotic G418 is used to select for cells in which the
DNA construct containing the Neo gene is integrated, either randomly or by homologous
recombination. The nucleoside analog gancyclovir is converted by TK to a cytotoxic
derivative. Cells in which the DNA has integrated by homologous recombination lose the TK gene and
are resistant to the drug, whereas cells that have incorporated the DNA randomly are
likely to retain the TK gene. Thus, cells containing random integrations into a
chromosomal location that allows the expression of the TK gene are killed. The G418
and gancyclovir resistant clones are then screened by PCR and Southern blot analysis,
and those that have undergone homologous DNA recombination are used for FVB/N blastocyst
injection (Genome Systems, Inc.). Between 4 and 16 blastocysts are transferred to the uterus
of a pseudopregnant foster mother. The pups are typically born 17 days after the transfer.
Either random-bred mice or F1 hybrid mice make suitable recipients. Females of certain
random-bred stocks (e.g., CD1 mice, from Charles River Laboratories) have very large
ampullae, which makes oviduct transfer easier. These mice also generally make good
mothers. Alternatively, F1 hybrid females (e.g., B6 x CBA F1) can be used as recipients.
Although their ampullae are smaller, they make exceptionally good mothers, rearing litters as
small as two pups. See, for example, Hogan et al. (1994), supra.
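
The positive-negative selection logic described above can be summarized as a small truth table. The following Python sketch is illustrative only: the `Clone` record and the function name are hypothetical and not part of the disclosure. It shows why a clone with a random integration at a site where TK is not expressed can survive both drugs, which is why surviving clones are still screened by PCR and Southern blot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clone:
    """Hypothetical record of what a transfected ES cell clone carries."""
    has_neo: bool        # Neo gene integrated (confers G418 resistance)
    expresses_tk: bool   # TK gene retained and expressed (gancyclovir sensitivity)


def survives_double_selection(clone: Clone) -> bool:
    """Return True if the clone survives G418 plus gancyclovir selection."""
    survives_g418 = clone.has_neo
    survives_gancyclovir = not clone.expresses_tk
    return survives_g418 and survives_gancyclovir


clones = {
    "untransfected": Clone(has_neo=False, expresses_tk=False),
    "random integration, TK expressed": Clone(has_neo=True, expresses_tk=True),
    "random integration, TK silent": Clone(has_neo=True, expresses_tk=False),
    "homologous recombination": Clone(has_neo=True, expresses_tk=False),
}

for label, clone in clones.items():
    outcome = "survives" if survives_double_selection(clone) else "killed"
    print(f"{label}: {outcome}")
```

The last two entries are indistinguishable by drug selection alone, so the molecular screening of Section B is what separates them.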

B. Screening for homologous DNA recombination positive ES cells

1) pTKLG-Fos/VEGFR2: Homologous DNA recombination between the
pTKLG-Fos/VEGFR2 targeting vector and the FosB gene is analyzed by Southern
blot as shown in Figure 8. Genomic DNA prepared from G418 resistant ES
cells is digested with PvuII and probed with probe A to confirm recombination at the 5' end.
PvuII digestion of DNA bearing homologous recombination reveals two
separate bands of 8.2 and 4.0 kb, whereas digestion of DNA from homologous
recombination negative clones reveals only the 8.2 kb band. Recombination at the 3' end
is tested by hybridizing NotI-digested DNA with probe B. NotI digestion
of DNA bearing homologous recombination reveals two separate bands of >8.2 kb
and 5.0 kb, whereas digestion of DNA from homologous recombination negative clones
reveals only the >8.2 kb band. Once homologous DNA recombination is confirmed,
positive clones are selected for FVB/N blastocyst injection.
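
The expected Southern blot band patterns can be written down as a simple classification rule. The sketch below is a minimal illustration: the band sizes come from the text, while the function names and the 0.3 kb sizing tolerance are assumptions introduced here.

```python
def has_band(observed_kb, expected_kb, tolerance_kb=0.3):
    """True if any observed band lies within the (assumed) tolerance of the expected size."""
    return any(abs(band - expected_kb) <= tolerance_kb for band in observed_kb)


def targeted_at_5prime(pvuii_bands_kb):
    """PvuII digest, probe A: targeted clones show the 8.2 kb band plus a 4.0 kb band."""
    return has_band(pvuii_bands_kb, 8.2) and has_band(pvuii_bands_kb, 4.0)


def targeted_at_3prime(noti_bands_kb):
    """NotI digest, probe B: targeted clones show a >8.2 kb band plus a 5.0 kb band."""
    has_large_band = any(band > 8.2 for band in noti_bands_kb)
    return has_large_band and has_band(noti_bands_kb, 5.0)


def select_for_injection(pvuii_bands_kb, noti_bands_kb):
    """A clone is selected only if both ends show the recombinant pattern."""
    return targeted_at_5prime(pvuii_bands_kb) and targeted_at_3prime(noti_bands_kb)


print(select_for_injection([8.2, 4.0], [9.5, 5.0]))  # recombinant pattern at both ends
print(select_for_injection([8.2], [9.5]))            # parental bands only
```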

2) pTKLG-Fos/Tie2: Homologous DNA recombination between the
pTKLG-Fos/Tie2 targeting vector and the FosB gene is analyzed by Southern blot in a
similar manner as described above for pTKLG-Fos/VEGFR2. Once homologous DNA
recombination is confirmed, positive clones are selected for FVB/N blastocyst injection.

3) pTKLR-Vn/VEGF: Homologous DNA recombination between the
pTKLR-Vn/VEGF targeting vector and the vitronectin gene is analyzed by PCR. DNA
primers, designed according to the predicted homologous recombination, are listed in
Table 2.

Table 2

PCR primers for analysis of homologous DNA recombination between the pTKLR-Vn/VEGF
targeting vector and the vitronectin gene

5' end primers
F51   5'- CCCAGTGTCTCTGATTTAGGGAGAGCACCTGAG -3'     SEQ ID NO:30
R51   5'- CCAGACTGCCTTGGGAAAAGCGCCTC -3'            SEQ ID NO:31
F52   5'- CAGTGAGAGTCTTCTCTGTCCCTCAATCGGTTCTG -3'   SEQ ID NO:32
R52   5'- TGGATGTGGAATGTGTGCGAGGCCAG -3'            SEQ ID NO:33

3' end primers
F31   5'- AATCAAAGAGGCGAACTGTGTGTGAGAGGTCC -3'      SEQ ID NO:34
R31   5'- CGGCTCCCCAAAATGTGGAAGCAAGC -3'            SEQ ID NO:35
F32   5'- GAATCCATCTTGCTCCAACACCCCAACATC -3'        SEQ ID NO:36
R32   5'- CGCCTCCTCTCCCCAGTCTCCCCTTG -3'            SEQ ID NO:37

Primers F51-R51 and F52-R52 amplify a 1799 bp and a 1841 bp DNA fragment,
respectively, from the 5' end of the transgene that is integrated into the vitronectin site
through homologous DNA recombination, whereas primers F31-R31 and F32-R32
amplify a 3549 bp and a 3428 bp DNA fragment, respectively, from the 3' end of the
transgene that is integrated into the vitronectin site through homologous DNA
recombination. Clones that allow successful amplification of both the 5' end and 3' end
of the integrated transgene are selected for FVB/N blastocyst injection.
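
The PCR selection criterion can be sketched in the same way. In the Python sketch below, the expected product sizes are those stated in the text; the 50 bp sizing tolerance, the function names, and the reading that one confirming primer pair per end is sufficient are assumptions made for illustration.

```python
# Expected product sizes (bp) for each primer pair, as stated in the text.
EXPECTED_PRODUCT_BP = {
    "F51-R51": ("5prime", 1799),
    "F52-R52": ("5prime", 1841),
    "F31-R31": ("3prime", 3549),
    "F32-R32": ("3prime", 3428),
}


def confirmed_ends(observed_bp, tolerance_bp=50):
    """Return the set of transgene ends confirmed by a product of the expected size.

    observed_bp maps a primer pair name to the observed product size in bp,
    or to None when no product was obtained. The tolerance is an assumption.
    """
    ends = set()
    for pair, size in observed_bp.items():
        end, expected = EXPECTED_PRODUCT_BP[pair]
        if size is not None and abs(size - expected) <= tolerance_bp:
            ends.add(end)
    return ends


def select_for_injection(observed_bp, tolerance_bp=50):
    """A clone is selected only if both the 5' end and the 3' end are confirmed."""
    return confirmed_ends(observed_bp, tolerance_bp) == {"5prime", "3prime"}


print(select_for_injection({"F51-R51": 1800, "F31-R31": 3550}))  # both ends amplified
print(select_for_injection({"F51-R51": 1800, "F31-R31": None}))  # 3' end not amplified
```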



C. Analysis of chimeric mice

The pups that develop from injected blastocysts include chimeras, which can be
identified by their agouti coat color when an ES cell derived from a mouse having a dark
coat color (e.g., C57BL/6) is injected into the blastocyst of a light coat color animal (e.g.,
FVB/N, genotype B/B). DNA analysis (e.g., Southern blotting, PCR) is conducted to
further confirm the presence of the transgene in these pups as described above in Section
B. These animals may be obtained commercially, for example from The Jackson
Laboratory, Bar Harbor, ME.

D. Generating targeted transgenic C57BL/6 mice with white coat color

Breeding of the chimeric mice generates homozygous targeted transgenic mice,
as depicted in Figure 9. The targeted mice are used to monitor gene expression through
the measurement of luciferase-mediated light emission from the mice. In a preferred
embodiment, the targeted mouse has a light coat color (e.g., white coat color), because
the black coat (an example of a dark coat color) of C57BL/6 mice can absorb
light emitted from the body and may interfere with the sensitivity of the bioluminescence
assay. An inbred mouse strain, C57BL/6-Tyr C2j/+ (Jackson Laboratory, Bar
Harbor, ME), is available for this purpose. Mice of this strain have a white coat, yet
they have the same genetic background as C57BL/6 mice except that the gene
responsible for the black coat color is mutated. Unfortunately, C57BL/6-Tyr C2j/+ ES
cells are not currently available. Therefore, the breeding program illustrated in
Figure 9 is designed to generate mice that are homozygous for the target transgene and have
a white coat color. C57BL/6 ES cells are prepared as described above and introduced into
a suitable blastocyst (e.g., from the FVB/N strain of mice). The blastocysts are
implanted into a foster mother. Chimeric mice are shown in Figure 9 as white animals
with black and green patches. Chimeric animals are bred with C57BL/6-Tyr C2j/+ mice
to create F1 hybrids. Subsequent breeding of the F1 hybrids generates several types of
mice, including one that is homozygous for the target transgene and has a white coat
color (shown in Figure 9 as b/b; L/L), which is used for in vivo gene regulation
monitoring.
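
As a rough guide to the expected yield of the desired b/b; L/L genotype, the intercross of two F1 animals can be worked through with a simple Mendelian calculation. The Python sketch below assumes that both F1 parents are B/b; L/+ (that is, F1 animals that inherited the transgene) and that the coat color locus and the transgene locus assort independently. Both are assumptions made here for illustration and are not stated in the text.

```python
from fractions import Fraction
from itertools import product


def cross(parent1, parent2):
    """Offspring genotype distribution at one locus.

    Each parent is a 2-tuple of alleles; every allele pairing is equally likely.
    """
    distribution = {}
    for allele1, allele2 in product(parent1, parent2):
        genotype = tuple(sorted((allele1, allele2)))
        distribution[genotype] = distribution.get(genotype, Fraction(0)) + Fraction(1, 4)
    return distribution


# F1 x F1 intercross, both parents assumed to be B/b; L/+.
coat_color = cross(("B", "b"), ("B", "b"))
transgene = cross(("L", "+"), ("L", "+"))

p_white_coat = coat_color[("b", "b")]    # homozygous for the mutated coat color allele
p_homozygous = transgene[("L", "L")]     # homozygous for the targeted transgene

# Independent assortment of the two loci is assumed here.
p_white_and_homozygous = p_white_coat * p_homozygous
print(p_white_and_homozygous)  # 1/16
```

Under these assumptions about one F2 pup in sixteen is expected to be b/b; L/L, which gives an indication of how many F2 animals need to be genotyped.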

A C57BL/6 mouse and a C57BL/6-Tyr C2j/+ mouse are considered to be
substantially isogenic. Accordingly, the method of the present invention exemplified in
Figure 9 provides a means for generating breeding groups of substantially isogenic mice
in a selected genetic background carrying at least one transgene of interest.

E. Dual luciferase targeted transgenic mice

As described above, two targeting vectors are generated. pTKLR-Vn carries a
red luciferase gene and is targeted into the vitronectin locus. pTKLG-Fos carries a
yellow-green luciferase gene and is targeted into the FosB locus. A number of promoters, including
the VEGF promoter, VEGFR2 promoter, and Tie2 promoter, are cloned into these vectors, as
described above. Subsequently, three types of targeted transgenic mice are generated.
VEGF mice carry the VEGF promoter-red luciferase transgene (VEGF-LucR) integrated
into the vitronectin locus. VEGFR2 mice carry the VEGFR2 promoter-yellow-green luciferase
(VEGFR2-LucYG) transgene integrated into the FosB locus. Tie2 mice carry the Tie2
promoter-yellow-green luciferase (Tie2-LucYG) transgene integrated into the FosB locus. Through a
breeding program illustrated in Figure 10, dual luciferase targeted transgenic mice are
produced, carrying both the VEGF-LucR and the VEGFR2-LucYG transgenes. The
oxidation of luciferin by yellow-green luciferase and red luciferase generates light
emitted at 540 nm and 610 nm, respectively. These wavelengths of light are measured
individually using a photon-counting camera (intensified CCD). Therefore, both VEGF
expression and VEGFR2 expression, for example, can be monitored in the same
mouse at the same time.
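
If the two emission spectra overlap in the detection bands, the two reporter signals can in principle be separated by linear unmixing of the 540 nm and 610 nm measurements. The Python sketch below illustrates that idea only; it is not a procedure described in this application. The mixing coefficients are hypothetical placeholders that would have to be calibrated with single-reporter animals, and the function name is introduced here for illustration.

```python
def unmix(signal_540, signal_610, mixing):
    """Solve a 2x2 linear system for the yellow-green and red reporter contributions.

    mixing = ((a, b), (c, d)) is assumed to satisfy
        signal_540 = a * yellow_green + b * red
        signal_610 = c * yellow_green + d * red
    The system is solved with Cramer's rule.
    """
    (a, b), (c, d) = mixing
    determinant = a * d - b * c
    if determinant == 0:
        raise ValueError("mixing matrix is singular; the channels cannot be separated")
    yellow_green = (signal_540 * d - b * signal_610) / determinant
    red = (a * signal_610 - c * signal_540) / determinant
    return yellow_green, red


# Hypothetical calibration: fraction of each reporter's light seen in each channel.
HYPOTHETICAL_MIXING = ((0.9, 0.1),
                       (0.2, 0.8))

yellow_green, red = unmix(95.0, 60.0, HYPOTHETICAL_MIXING)
print(round(yellow_green, 6), round(red, 6))
```

With an identity mixing matrix the function returns the two channel readings unchanged, which corresponds to the case in which each wavelength reports a single luciferase.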

As is apparent to one of skill in the art, various modifications and variations of the
above embodiments can be made without departing from the spirit and scope of this
invention. These modifications and variations are within the scope of this invention.

CLAIMS

What is claimed is: 

1. A vector for use in generating a transgenic non-human mammal, said mammal
having at least one single-copy, non-essential gene in its genome, said vector comprising 

(a) a first selectable marker and a reporter expression cassette, said reporter 
expression cassette comprising a transcriptional promoter element operably linked to a 
light-generating protein coding sequence, and 

(b) targeting polynucleotide sequences homologous to a single-copy, non- 
essential gene in said mammal's genome, said targeting polynucleotide sequences 
flanking (a), wherein (i) the length of the targeting polynucleotide sequences is
sufficient to facilitate homologous recombination between the vector and the single- 
copy, non-essential gene, and (ii) said transcriptional promoter element is heterologous 
to the single-copy, non-essential gene. 

2. The vector of claim 1, wherein said first selectable marker provides a positive 
selection. 

3. The vector of claim 2, wherein said first selectable marker is selected from the 
group consisting of neomycin phosphotransferase II, xanthine-guanine 
phosphoribosyltransferase, hygromycin-B-phosphotransferase, chloramphenicol 
acetyltransferase, and adeninephosphoribosyl transferase. 

4. The vector of claim 3, wherein the first selectable marker is neomycin 
phosphotransferase II. 

5. The vector of any of claims 1-4, further comprising a second selectable 
marker, wherein at least one target polynucleotide sequence is located between said 
second selectable marker and (a). 

6. The vector of claim 5, wherein said second selectable marker provides a 
negative selection. 

7. The vector of claim 6, wherein said second selectable marker is selected from
the group consisting of adenosine deaminase, thymidine kinase, and dihydrofolate 
reductase. 

8. The vector of any of claims 1-7, wherein the transcriptional promoter element
is selected from the group consisting of an inducible promoter, a repressible promoter,
and a constitutive promoter.

9. The vector of claim 8, wherein the transcriptional promoter element is selected
from the group consisting of VEGF, VEGFR, and TIE2.

10. The vector of any of claims 1-9, wherein the sequences encoding the light- 
generating protein are obtained from either procaryotic or eucaryotic sources. 

11. The vector of claim 10, wherein the light generating protein is a luciferase.

12. The vector of any of claims 1-11, wherein said reporter expression cassette
further comprises other control elements. 

13. The vector of claim 12, wherein said control elements are selected from the 
group consisting of transcription enhancer elements, transcription termination signals, 
polyadenylation sequences, sequences for optimization of initiation of translation, and 
translation termination sequences. 

14. The vector of any of claims 1-11, wherein said vector is circular. 

15. The vector of claim 14, wherein said vector contains at least one restriction 
site whose cleavage results in a linear vector having the following arrangement of 
elements: targeting polynucleotide sequence - (a) - targeting polynucleotide sequences. 

16. The vector of claim 5, wherein said vector is circular.

17. The vector of claim 16, wherein said vector contains at least one restriction
site whose cleavage results in a linear vector having the following arrangement of 
elements: target polynucleotide sequence - (a) - targeting polynucleotide sequences - 
(second selectable marker). 

18. The vector of any of claims 1-17, wherein the coding sequences of the 
reporter expression cassette comprise codons that are optimal for expression in a host 
system into which the expression cassette is to be introduced. 

19. The vector of claim 18, wherein said mammal is a rodent. 

20. The vector of claim 19, wherein said mammal is a mouse. 

21. The vector of any of claims 1-20, wherein said targeting polynucleotide
sequences from single-copy, non-essential genes are selected from the group consisting
of vitronectin, fosB, and galactin 3.

22. A method of producing a transgenic, non-human mammal, said mammal 
having at least one single-copy, non-essential gene in its genome, comprising 

transfecting an embryonic stem cell of said mammal with a linear vector 
comprising 

(a) a first selectable marker and a reporter expression cassette, said reporter 
expression cassette comprising a transcriptional promoter element operably linked to a 
light-generating protein coding sequence, and 

(b) targeting polynucleotide sequences homologous to a single-copy, non- 
essential gene in said mammal's genome, said targeting polynucleotide sequences 
flanking (a), wherein (i) the length of the polynucleotide sequences is sufficient to
facilitate homologous recombination between the vector and the single-copy, non- 
essential gene, and (ii) said transcriptional promoter element is heterologous to the 
single-copy, non-essential gene; 

selecting embryonic stem cells which each have said first selectable marker and
reporter expression cassette integrated into its genome;

injecting said embryonic stem cells into a host embryo,

implanting said embryo in a foster mother,

maintaining said foster mother under conditions which allow production of an 
offspring that is a transgenic, non-human mammal carrying said reporter expression 
cassette. 

23. The method of claim 22, wherein said offspring is capable of germline 
transmission of said reporter expression cassette. 

24. The method of claim 23 further comprising breeding said offspring with a
mammal, wherein the mammal is substantially isogenic with the embryonic stem cells,
wherein said breeding yields transgenic F1 offspring carrying said reporter cassette.

25. The method of claim 24, further comprising breeding a first F1 offspring
carrying said reporter cassette with a second F1 offspring carrying said reporter cassette,
wherein said breeding yields transgenic F2 offspring carrying said reporter cassette.

26. The method of any of claims 22-25, wherein said mammal is a mouse. 

27. The method of claim 26, wherein said embryonic stem cells are derived
from a mouse having a dark coat color. 

28. The method of claim 27, wherein said mammal substantially isogenic with 
the embryonic stem cells has a light coat color. 

29. The method of claim 28, wherein said F2 offspring carrying said reporter 
cassette has a light coat color. 

30. The method of claim 29, wherein said embryonic stem cells are derived
from a C57BL/6 mouse having a dark coat color, and said mammal substantially
isogenic with the embryonic stem cells is a C57BL/6-Tyr C2J/+ mouse having a light
coat color.

31. The method of any of claims 22-30, wherein the light generating protein is a
luciferase.

32. A transgenic, non-human mammal, comprising at least one single-copy, non- 
essential gene in its genome, wherein (i) at least a portion of at least one single-copy, 
non-essential gene is replaced by polynucleotide sequences heterologous to the gene, 
such that the gene cannot produce a functional gene product, and (ii) said polynucleotide 
sequences comprise a first expression cassette which has been introduced into said 
mammal or an ancestor of said mammal, at an embryonic stage, said first expression 
cassette comprising 

a first selectable marker,

a first transcriptional promoter element heterologous to the gene, and 
light-generating protein coding sequences, wherein said light-generating protein 
coding sequences are operably linked to said promoter element. 

33. The transgenic, non-human mammal of claim 32, wherein the single-copy,
non-essential gene is selected from the group consisting of vitronectin, fosB, and galactin 3.

34. The transgenic, non-human mammal of any of claims 32-33, wherein the
first selectable marker is selected from the group consisting of neomycin
phosphotransferase II, xanthine-guanine phosphoribosyltransferase,
hygromycin-B-phosphotransferase, chloramphenicol acetyltransferase, and adeninephosphoribosyl
transferase.

35. The transgenic, non-human mammal of any of claims 32-34, wherein the 
first transcriptional promoter element is selected from the group consisting of an 
inducible promoter, a repressible promoter, and a constitutive promoter. 

36. The transgenic, non-human mammal of claim 35, wherein the first
transcriptional promoter element is selected from the group consisting of VEGF,
VEGFR, and TIE2.

37. The transgenic, non-human mammal of any of claims 32-36, wherein said 
expression cassette further comprises other control elements. 

38. The transgenic, non-human mammal of claim 37, wherein said control 
elements are selected from the group consisting of transcription enhancer elements, 
transcription termination signals, polyadenylation sequences, sequences for optimization 
of initiation of translation, and translation termination sequences. 

39. The transgenic, non-human mammal of any of claims 32-38, wherein said 
non-human mammal comprises a second single-copy, non-essential gene in its genome, 
wherein (i) at least a portion of the second single-copy, non-essential gene is replaced by 
polynucleotide sequences heterologous to the second gene, such that the second gene 
cannot produce a functional gene product, and (ii) said polynucleotide sequences 
comprise a second expression cassette which has been introduced into said mammal or 
an ancestor of said mammal, at an embryonic stage, said first expression cassette 
comprising 

a second selectable marker, 

a second transcriptional promoter element heterologous to the second gene, and 
light-generating protein coding sequences, wherein said light-generating protein 
coding sequences are operably linked to said second promoter element. 

40. The transgenic, non-human mammal of claim 39, wherein the first 
transcriptional promoter element in the first expression cassette is different from the 
second transcriptional promoter element in the second expression cassette. 

41. The transgenic, non-human mammal of claim 40, wherein the light- 
generating protein in the first expression cassette can produce a different color of light 
relative to the light-generating protein in the second expression cassette. 

42. The transgenic, non-human mammal of any of claims 32-41, said mammal
being a rodent.

43. The transgenic, non-human mammal of claim 42, said rodent being a mouse.

44. The transgenic, non-human mammal of any of claims 32-43, wherein the 
sequences encoding the light-generating protein are obtained from either procaryotic or 
eucaryotic sources. 

45. The transgenic, non-human mammal of claim 44 wherein the light- 
generating protein is a luciferase. 

[Sequence listing figure: double-stranded nucleotide sequence, numbered in 100-base increments from position 1 to about position 5000. It appears to be the vitronectin locus, since the row beginning at position 101 contains the sequences of primers F51 and F52 of Table 2. The sequence text is not recoverable from the scanned image.]

[Sequence listing figure: a second double-stranded nucleotide sequence, numbered in 100-base increments starting at position 1. The sequence text is not recoverable from the scanned image.]
GA9XACACAA OCTM3COQ0T CAOOOtCCOC CT00CTCAOC 0CMXX7XODS AOBnSACAA CTOOQAASKC TGTACOGTCC TtGCICSATO A0ITO0QOIC 

1001 OOCTOSOTOC COCAOCACT OaOOOOOCAA O0OQAAO1QO tCQOCCTTCA AOCAOCACAA CCMCACTOG AOCfOTOICT UJXLIILLM5 C T*r.*ir T Ml 
CQOACTOCO GATQTCOtOA Wf/TV/n OOOCTTOOC AOQCOBAM7T TMlWlVVt 03tO7VOCC tOGMCACAGA COOaCAO&lC UAUIUUIVT 

1901 QOCTACAACX QCPDBIfflkAfl ACACACTAAO tXTCAOXXTT CN3CAOTT0G GA30CA0CAC QCTNXTNX1 GAXOTQOXT CAQTZTO9C AOTQCCTICC 
COQATCTTCT 0000CTC1TC TCTGTCATTC KOCTCCOCK (7ICCTCAAOC CXAOCTCCTC OGATOGATOC CXMCAO00SA OTCAAACKTO TCACQSAAOC 

2001 TQOCATOCAT CAACKXCCCT M2CACACCKT AAOXAOCAO WQTPqOCA GWCCTGTAAC COCAOCTCTC ASAAaTTOGA OCCAO S A flB j OCA0OAO1 TC 

AaxnxcciA cttcgaoosa tcototoota T Paocxccic aocmxaost ctogwwto oosrasACAa tcttoocct ccoiccxaci coicctcaao ' 

2101 QAOOCCAOCC TOTOCtACTT A1OSAOT0CA OQCTQCACTO CAASACATCA fDttWICA A AAOTTOOCCT T30Q00CAO0 TOOPT OOOS AA01MOCA 
CtOOOOTOOn ACACQAXGAA BCCTCAOOT COGACOTGMC OTTCTCTAOT AAOAAAfiTT TfCAACCOCA AOOOOBCTCe AG0CACT0OC fTCARCTCT 

1201 AAOXGACAflT AAITtTOJCA CTTXKOaTT OCAOOTtOCT CtGACOCCIC AAOTCXOAAfl QAACTTPCC ATTCTOGOCA CT CADCAOT A UAA/lXAilA 
TTCACTCnCA THUUUCACT GAATZXTCAA OCTOCAAOCA QACTCOOCAC TTCAGACTTC CTTUAAATOO tMCACOOOT CACTOCTCAT OCOCAATAAT 

2301 HlOOAilTU AO0A0CAAO0 AACTtTtCTT A UJUC1U AJX GAOODUOCOC CAGATCICAT WIUJ1ATC TCnSACZCM CTOOOOCAO AAOAACA^CA 
AAAOCCCAAO T0C T 0C TT 0C TXCAAAACAA tOOCOCOVT CTOCATQCOO CTCBCMTBt COM2OU CT M0 AQACTUAOZC GAATOHJTC TTCTXCtTCT 

2401 AAAO0CAAO0 OTTGQCM3W3 MXXBUCXk GCtQCCTOCA OCXMOTQCA O5AA0COT00 GAOOSAOClQ ACACASOCAC TtCAOOCOCT AAOQAO OO T 
TWCOCnOC CAAOOOTCXC TOOCCTTQTT t^W-^^ r CCATSCACOT OCTTOOCAOC CTCCOCgAC TOTCWCCTO AAOTO000CA TSCCXCCfCA 

2soi ctoooagtOT ctTsioooco TryrrrAgyr Acxczaocrr cncwccoe OTmaac iotocctoto tgctxaaosa o gaaa oxo: tctpoosaa 

GAOCOCCACA OAACIOCOCC A0CAO0CT00 TCACACOOAA CAACAAOOOO OCAAACA0TO AOOOBACAC AOOATTTOCT OCRtOOGDO ACAATOCCTT 

2601 CAOOOOTOD TXOCOCVGA TOGAOTOQCT CCATXTOCAT 0CTCNQAO0C ATOCOCACTT ACTTTOCACT CTJ03CCACT TECCCTGAAT ATC TODCCA C 
OfOOOO ttlC AMOCCACr ACCICMCCA OCIXTXOCTX CCACTCTOOC TA0QQ7TGAA TCAAM3CTCA CAAO309TQA AAOCCACTTA TACAOaOOXO 

3701 AIOTGACCCr 0L1UAW1C TCXCAGOCtA AOOCACAAC CTACAOCAOC WUOTCTCXC ACCTTCTTTT CTTCACSAAA TMTAA1CCA TTTPQCC1TC 
TACA0TO3CA OBACOBAAAC ACACXOOCAT IC CIWriC CA TC 1 CCT 0C AZtAACACAO TOCAACAAAA GMOIGART ATttTOOOT AA A ACCCAAO 

2eoi croccxocAT mnnia icaocxooos axcuocict ootaoitcac oultultujc ccaacttcat aodctcaact tt caopoct t ooct gacat o 

GACOCAOCTA AAAAAAAAOQ ACTOCACCCC TM2ATOGACA GCAICAAOTC OO a A Og AO OO OQTTCAACTA TCOCACTICA AACTCOOGAA GCCACTCTAC 

3901 CCATCATCCT GACTOOCTCT CCCTQCAAAC TXTTTTCTOC TAAOTCAATT tt-TMILlU: TACTTCAOCT ATC1ACAGTY CTOOOCAAC T tGACCTOCTQ 
COTACTAOQA CBCACOCACA CflBACCTlTB ATAAAACAOO AT1CACTXAA CCAACACAOO AX0AAOT0SA TM3ATOTCAR CAOOQCRCA ACTCCACCAC 

3001 OOBOOCAtCA MOOOCACITC ITTCTCTCT1 TTTXACCXCA QZOCAACOCC CCACACACAA AACTTCATOC CXO0CCCIT0 AAACCAOO CT OCCTCTCTJA 
C OOBOOTOB T TOOOCTSAAfl AAAflACACAA AAAAXOBACT O CBTTOOBQ OCB01CT01T TTGAACXACO 6AO0O0aAAC TTXQQTOCCA CSCACAGACT 

3101 CTOOOOTCO OCAOCCXQAA OSACATOOOT AJCACAACCT CATTAAAAAC AACACATMfl OOTUOCOC TOC TCAACA AACTCTCTO 1111 IL Ml 1 
CAO0O0CABC CCTCCQACTr CCTCTMOCCA TTOTCTTOCA GTAATRTTC TTDTOTKRC OTAATOGATO ACTGM7TTGT TICACATCAC AAAAACAAAA 

3201 nOCICtCM AAAATUTTT OOfTICTm tTtATtATXT OC T BCTOm GK7RSACTOC tOCIQCMKA CACCACACAT AOGAffTT Og MQCAAATTr 
AACCACAOTf TTRAAtAAA OCAAACAAAT AAAXAATAAA OCAAXACAAA CTCACICACG AOCACGTOBT OXC0TGTOTA TGCTOCAGTC tCCCTXTAAA 

3301 TCATACITTO VIVll.mL 1T UXm/nillU gUHJCllLAT OOCAATCTOC TIC*CTOflT GACCtACAAT <I,UU VVLl OOOCTTTAAO QPCACCT CT 
M7EATCAAAC AXfiAfJMTVA ITTTT^WnT CCACGAACCA CCQTDCAOO AM7TGM7TCA CTCCATDTTA nmrr.hH^ COOOAAATTC OBTCTCktCA 

3401 ccrritoncA oaaaswacT i iaauxc tctcaamttt cmmtacaa aioticacca irATATTAtw crtocAonc ttaocTATCA otdacoxcca 

OSAATCKTf7r OODOCTOOCA. WTMXXQG AGAOTTTCAA CTCWATOTT OCAMROCT MTOTOOTOC GAACCTCAAC AAOOCATACT CACTOCAa3T 

3501 CTCCTOOCtA OCTTCRCCC AACCATCTTT 1M2TCTSAXO OOSAAACOBA OOOCCAOBA OCATOOTCXX OCAOCATTTC CTCT POOOO Aawnwi 1 
CAOCAOOCAT cm »> fi MO nC TTOBtACAAA AXCACACHC COCTTOOCt COOTOCTCA T O7XA0CAGAT OOTCCTAAAC CACAATOCCC T O CP aaOO A 
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3(01 cacttgocac ocAcu mu.' agcccgctcu atcaocaoca acaatctato aotctocgct T onxa^m aaoctactct uium/iuu: 

gtcaaccctc ccrccxooc t o g g g oc acc i>gtcctcct icnjcxnc T ocwx ocA aoooooccac TTCCA3GACA ^nf^OTO jyg wuxg r 



3701 ;i*ii IUL «*vmu» atgacctoqc octqctqqa tocatboga aactcaicao cttgmcms maacooca gctgcacto gagxtogccc 

AAGAGCAAAC MMKMU TJCTOACCC Q3ACCACOCT AOODUTOCT TICACEACIC CAACTTCTCC nTHXCTT oSSSflC & " 0 ™ 3OT 



3B01 mi UiW ACACAAOSAA COCCIOCACT TO.iU.1UUT OOOOCACAAA CCBGBCTBCA AGATOCCCtA COAACAOQQQ CCQ GOm A Q rmrnryy* 
TOCACliiTlT TCTCT F CC TT GCGCACCTCA AACAGCAOCA CtUAJlUiU OCCCCGACGT tCIXOOOCAT OCrilL l UXX: G GOCCCCC TC rSvv^^ 

3*01 QSACC TCACA CATT TOOO CCTCAACATC CGCT XAOSAA CAOQaC TieO UtlUUL l UL T OTTM ULILT w^r>*#rng ^tVLVL TT r'-PrfgnritfT' 
OCTCCACTCT CTAAACGCXC CCACTtOTG OGCATTCCTT C1U.UU LAC C fttACCGACGA HAMUJJA OJIW1VUJU ' v "*'-r3 rftA QOIUIU/IUU 

4001 CCACACCC A C COXC AACCT QMJUULTA.T CTCTTTACAC ACACTCAACT TCAA0TCCTC GGCGACCCCT TOCCCPTICT flmrnm i „..., 

OCTCnCGTQ OOGCCTICGA CTGCCCAAGA GAGAAATCTO TGTCACTTCA AOtTCAOSAC CCOCtGCCC A AOQQQCAACA ATCXXXAACC AtoSCSS 

4101 ottttcttxt occtoct oagctctccg qaiugma ottt^acqc Af TA f ro acA oocAOCAqcc otocgacoco cigaacxqqc mmm 

CGAAAGAGCA GTOGACGCGC CtOCAGAOQC qCMO C OOC C OOOOglTODQ TOgTCQOOG T COCTO0TO00 rftmimr aCTOMM flSSaSSS 
4201 TGCTCTOTAA ACTC1TXACA CAAACAAAAC AAACAAACCC OTAAfKAACA AffTAfiCAOCA ACA1CAGGAO OACAOOOCAfl CAAQCAOICC rararoiCT 
ACSACACATT TCACAAA3CT O UIOUUU tl«.AlUUJ OiniJL T W T MC ' IULWL T lLlAllU/H raiUOJUC CTTOGTCAGO £^g£g 

° W !2SS!25 TF , 2? Crr CTOTCTC * CC * " «CO C C TCTOCCA1CQ <»CATGAC00 AAOCAOCTCC UIU1UH1T O WCW.1t.lC ILlULmm 
CACAOCTQQC AAACTCACAA CACACACTCC TGCAC0GGQ9 ACACGGTAOC I I l/ HUJI U UL: mmc»m Hinr i m rw-v^^ 

4401 ^ i U ! > ! UA " rrjtfWXTKA CAGC1GGTCA Cm OOOSAC AOO00GTO OO O OOT ATCA ACACOCCTCC TOCADOCTT WIBHOTU CTTCAACCCA 
ACACOOOOCC CCIVWUUCT C1GCAOCACT CAAAOOCCTO TO00CAC0C COCOOCTACT TGTOOOCA GC ACGTATAGAA ACAGGACAAT CAAOTtSr 

4501 acttciggqo atacatcgct cactoggtoo onoaarraaa otocaacqqc caccttooc otctocgtq aggciggagc cgaaagaoic ctogtcigo 

T C AA C A C O CC TA1CXAC0CA CTCACCCACC CA3O0CAGCC OOOTfOOOQ OT0SAAMDOQ CAGAATOCAC TOOCAOCTpC CCTTTCTCAC GACTOCACC 

4601 552222? aETT ^?° T OOCWOCA T OCAflC TOCA CACACACOCA ACCAOGAAAT GACAOTAPPG WKJUJL I T * ' ■ 'AHJl ' A00CACOCAT 
0CAO0TCCCA O0CAACTCCA OCTCCACCCT ACCT0CAQGT CTCTCTOOOT llXiLCiriX C T OWOTCAX AOBATAqSAA »*MH7mm llAAVlUAVtA 

4701 CCACCC TCAA OOCTOCAOOq I CAO^ACAT ACCTCTOTTT TUClUJLlUi OCOCTtAOCT GATTAACTtA ACATTTOCAA GASGTDSCAA U.1UL^1X.TU 
OGTOQCAOTT OQCAanODC ACTOCTTTCEA tCGAGACAAA AOCAOCCAOC OQQGAATOGA CXAATTOAAT TTOJUOTTT CTOCAATCTT OQAOGAflGAC 

4601 «OCAAT TCA OOOOOCGACT GA03 C AAOTC OATOOCCCT TXOOCA0XCT OCTAAOQ0CA CtT COOOC ro AXTOCAAAAT GTCAACCQCT ATCTCACTOC 
CiacmACT COaoaaCTCA CtGOCTtCM CIAOOOOOGA AAOOCTCACA OGATTOOQGT CAAOqOOCAC TAAGGTXTIX OCTTOOOaA 13UQACXGAC6 

4S01 tgglCITTC OCTOCTQOBA AAACTOOCTC AOSTTOQATT fTITlOClCO TCPQOCACAO AOOeCOCTCC CAACTCMX3C O0OCTCCCAC GOCratOOUl 
AGTCAGAAAC OSAOGACDCT TTIWCCCAO TOCAACCEAA AAAAAOSAOC AGACGATOC TOOOOQCAOG GTIGAOXQ00 O0COMXA7TO OOGAOOOIC 

*° M T^T?^ TCT ^' 1VR; *cdctcaooc ccaooccagc oxxc n u a c co njLian T aaoccmcr wrrrmuuL mcuxxxxk octoccacoc 

ATAAXACCAT ACAGGGAGAQ TOOCAGTOG GCtGGCGTCC OCO0GAACCC 0CAOGM3CAA COOGGAATGA OCAAAACOOQ idLlULUJU; OCAOGCTOCO 
$101 CCAILT1U.T OCAUJJJ1TP AOCTOIGAA TQM7nX7XQJ QATXQCTOOQ OGOOQOGGAT OGGATTGACC TOC M AAJL ' IV CAAAACXTTT CCTQGCtgIC 

tt3P>CMCC ^ OClCOaiAAA TATCACACIT AC1CACCACC CTAACGAOCC OOQCCOCCVi CtXTAACTGG OO0TCGGGAO G71TXGAAAA GGAOCCOGAO 
S301 CQCXtCriOC ACTrOCTrOC IULLI UJULT TCAOOOSAO TrAGACTCCA AAOGASGM3C ACCM3ZATC COOG TOOeCT TC T TGCTC AO GOOOCAOACr 

QnGMOMGC TGAAOGAAOO AflQCAOGGGA ACTGTOOCIC AATCTCAQCT TICCXAC1GO TOCltAXT t A O QGOCM00GA ACAAOGAOIC O000GICTCA 
5301 TTTTCTCTXT AAOTUTPtOO CLT1 1AAAA0 C CTAGGACGC CAACT1CTCC OCACCCTGGG AOOCCEOCAT OCTCTCACAO ACGTOCAfld^ AATTTTCAGA 

AAAACAGAAA TTCACGAAGC GGAAGGGGTC CCATOCTOCD GTtCAAGAOO GGTCCCAOOC TQQQQO0GTA GGACAGT0TC T0CAGCTC0C TTAAAAOICT 

5401 SISniS fTiWiaX C1MOCT0CA WTTICAATC OOCAAATAOT TTTXOOACTA GGAVCTIAA GAO0Q0QCTQ ACTOXJCACT 

1 1 iv*AAAAOT COC CACTOCO A AACOCAOO O GAXAOGAOCT AHUUCTCAS O08I I iATCA AAAAOCTGAT COTAIGAATT f i ^iAlW TCANOQGTGA 

5501 JJSSSISrSS TXCAGTOOCA AAGACOAOT T CWICOCITC CCILLAUL1T TCADCTOCTO AGAATGCCAC GACTCACATT fCTWmCT 

tAflOGTOACO TAOCTTAAGG AAGTCAGGGT TTCTGCTCAA CACAO0CAA C OGAO0ICCAA AOIGGAGCAC TCITAGGCTS C1CAOTCTAA ACKEAAAACA 

5401 AAXATTOOQO AGATOOOOPC TA UJUUUJUf CCCCCCTBCT GCATOGAACA TICCATAOCC TOWCT0OOC CCCAGGTIGC AAAOCTAATC CCAAACCQCA 
TIATAACOCC TC PO0 0O 0 G ATOQOOOOCA OOQQOCACGA CGTMDCTTUT AAOGtSOOGS JCAGGAC00G GGAT0CAAGG TRGGATXM3 GCTTTOOOGT 

5701 525522?* UAAUJXTr tCCTQGTICC CAfcAAAPCA C TWCAICTXt TATOTMAAA 1AAAOJPVTT ATATATOAGT UlUUimit/f GTQOGT?nac 
OaaOlCCAT W A TA flOCAA AOGMXAAGG OTOTOOIO AAXAOGAXA ASOWEATTT ATTTMXAXAA XAXA1ACTCA CACOCACACA CACGCAOdl 

5M1 SSSSSST 225 T 5E? rOC CAO=TTOCrT OTTTTCAACT GTGCTGTQGA CTTCAAAAIC O CT1 1.1UAAJ ATTXCAOTCA CACmcroO LlUiUJtlTr 
CAOBCWOCfc CaCAOBCAflG CtCGAAGGAA CAAAAGTTCA CACGJCAOCT CAAGTTmC GGAAGACOGC TAAACICAGT CSGAAAGACC >y^?rgjJL 

5901 TTOTCACTTT muilUlIU lUlUAALmC taOOOPt T aSAGAOGTC COOOOCICTC OCTmroCT TTCTCAAGTC t O I CTOOCTC agaocacttc 
AACAOXOAAA AAACAAC A AC B G A flOCG A O O AGACOOACAA CRCKRCM3 GGOCQGAGAQ OGAAATAGGA AMQAGTtCAC ACACAOGGAO yT^ ^ i ftAfi 

MW S25!5Si TOCrcTOT «"AMWU TCTOmMIT CTOGAITOr GGOOGACATQ CAATTTnCT TCT9TAAGTA AOTTOACTO 

0ITOOC W ACTCACAC» OA Q qpeACA C ACACAATXAA CACC0AACA QOOOCTGTAC OTtAAAATCA MGACAZTCAT tCACACTGAC 

5101 Uwwwiy ill I'l ACAAT CDCXATOGIT CACAAT 7CTC GGTQGAAATO TCTQA3GM3Q ACAAOGGGCT OCCACTOOCO AflOOUtTIC ARGACTOCA 
OCAOCMCtA AAAAATGTtA GA H i»OCAA C1CTAAGAC OCAULUiAC AGACCAGTOC 1C T 1UJJUU A G0C7XGAC0QC tanonXAfl OACRMBOT 

C201 ^555!^ TriorcAirr rmamr omrnror atttiogaoc tgaooooqqo q o wcwoaa caoicowca ctoggcaoct 

ATOOOCAGTO GOXCOBACAT AAAC AC tAAA AAAAGTAAAA CAAAAAAACA TAAAACOTQG ACIGCGCCQC COCGAC00C OTCAGATAGT GACQCOTGGA 

6301 °™»nrr og ei c t oBC caaj aaaaac cttwaaaaa actgeatoct tcaogtcaaa oioicwm roxtooACA tcwciacat 

OS AACCAACA O7TCACA00C GTEATTXTXC GAAAATTTTT TOACAtMGGA A0TCGAOTTT CACWCAAA AGGGA0C1GT AGAIGATQfXA 

64W SSISFTT C * CMAAAOC CACTITOAT TOC TAOOGAA Q1LTIUL T UI CACTTACTOO GACOOCXAAC GAATCAGAAC CaCAAOSGG ACIAAAACGA 
CCGAACGAAA GiUlinWJ CtCAAACCEA ACGATQXTT CAGAMXACC CTOUWCAOC CTQ00GATIC CTfAOTCTJO GATOTOOOC TCATTTTOCT 

5501 *C TOCACAC T lU-MUJUT TOOCATCTTC OCACCCT GGG CCACCXACTT GAAAAAATAA GQGCTTrMA AGTCtAAGGT ACCAAATITO GtCAAOOGTC 
TOOCTCTO AOGATOCAAA ACGCEACAAC Cl J l V UiACP C OTtCGATGAA CTXTTTWTT CLtHJLHTI TCACATXCCA 1QCTTCAAAC CACTICCGAO 

6601 ESSES**? «ATO»0 AAAACAAt TT ATTCAOCTT8 OOIUXDCAAT GAACmCAO GAACAGTXAA OOOCAAOOQT OTAAAAOCTO GOCACAACTT 

MxcrcnaA aot>ciaocc ttttcteaaa taactocaac gcacacgita cttgaaaotc ottotcaatt ccgottooca cattttcgac oooronGAA 

6701 S2£??55!5 ? CArn OCA CCT G GA G Qrt A AGGCCATCAA CTOOIQCACT TCAOIGTCAT OTOCATOTA GATAOCAAGC OCAAACATCr O C MD On S A 
CATTTACGAT COIAAACTCT OCACCTC0QT TOOCCTAOTP GACCACCtCA AOSCACA0XA CACCfAOCAT CTATBCTIO0 COTTICtAfiA CGAPCOCCT 

W ° X Tlfvrfcn °0 CAOCCAGAAC VIIWIUVIU AGQOTAGTGG AOOGCAA0XQ GACAC7ZGAGA OTTAOCCTCA 00GACATICT ACAOGCAASG 

"OCOOAACC ATGTQGTOCC CTOGGICTTC AAAGCACGAC tCQGAlCACC T0COSTICAC CTCTCACTCT CAATCOGAGT CQCTCtAACA TCT0CGTOC 

6501 iSSIS2!T OCfc0BC tCC Cm CAAAOC ACTACACAOC CGCACCAGGT TRCAOCAGA GAAGGTTACA GTEAOOTOOT CtCTTCTAflC CCAK3XACO 
WO0TOC * A O^IUX AQC CAAACTTTCG TGATCTCT0G GGGtORGCA AAACTCGTUT CTTOAATCT CAAICCAOCA GAfiAAGAJOQ Q0XAaX3TCC 

7001 22222* T " =MCMO ° ATOCAGAATC GAAAflCACA C CAGAACAAO G ATCCAACAGG CA3CGAOCAG GCACAACACA TTTCtCTtCT 

"CTOCTOCT OCCACTCCCA AACTtCTTOC TAflCTCTTAC CTTTCOTCIC ClCnClTCC TACCTTCTCC OEACCXOCTC OglCHCtOT AAAOACAACA 

7101 SSSSi °? CT0SAMO S?^ 805 t q ° aa » CA10CICAGC A0TOOBGTO TCSV3GGGGT TCTTOGAAAA GAGAAGGCAT 
AATTATOPTT OOQAOC fTTC CTXTTGAAOO ACCTOTCCT CTACCAE7TGO TCAOXCAflC AGATUXCCA AGAACCTTTT CTCTTOTtA 



F/G. 






[Figure sheet: continuation of the nucleotide sequence, numbered from about 7200 to about 11200; the base-level text is not recoverable from the source.]

FIG. 4B-3

SEQUENCE LISTING



<110> XENOGEN CORPORATION 

<120> TARGETING CONSTRUCTS AND TRANSGENIC ANIMALS PRODUCED 
THEREWITH 

<130> PXE-00B.PC 

<140> 

<141> 1999-12-16 

<150> 60/152,522 
<151> 1999-09-03 

<160> 39 

<170> PatentIn Ver. 2.0

<210> 1
<211> 31
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer PGKP
<400> 1

atcgaattct accgggtagg ggaggcgctt t 31

<210> 2
<211> 30
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer PGKR
<400> 2

ggctgcaggt cgaaaggccc ggagatgagg 30

<210> 3
<211> 32
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer NeoF
<400> 3

acctgcagcc aatatgggat cggccattga ac 32

<210> 4
<211> 37
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer NeoR
<400> 4

ggatccgcgg ccgcccccag ctggttcttt ccgcctc 37






<210> 5 

<211> 30 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer TKF 

<400> 5 

ggatcctcta gagtcgagca gtgtggtttt 30 

<210> 6
<211> 30
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer TKR
<400> 6

gagctcccgt agtcaggttt agttcgtccg 30

<210> 7
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer F5R51
<400> 7

gtacatttaa atcctgcagg 20

<210> 8
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer F5R52
<400> 8

agctcctgca ggatttaaat 20

<210> 9
<211> 77
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer F3R31
<400> 9

ggcccgggct taattaatgc atcatatggt accgtttaaa cgcggccgca agcttgtcga 60
cggcgcgccg gccggcc 77

<210> 10
<211> 77
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer F3R32
<400> 10

gatcggccgg ccggcgcgcc gtcgacaagc ttgcggccgc gtttaaacgg taccatatga 60
tgcattaatt aagcccg 77

<210> 11 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VN1R 
<400> 11 

ctgtatttaa atctgcccac cctattcagg acagtagtc 39 

<210> 12 
<211> 31 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VN1F 



<400> 12 

ccaatgcatc aacccagcca ggaggagtgc g 31 

<210> 13 
<211> 38
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequence: primer VN2R 
<400> 13 

aacgcgtcga cttcggagat gtttcgggga taaccagg 38 


<210> 14 
<211> 40 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer VN2F 
<400> 14 

ttggcgcgcc ccatagagaa gagacaccaa aggcacgctc 40 

<210> 15
<211> 36
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer FosB1F
<400> 15

ctgtatttaa atcccgtttc tcactgtgcc tgtgtc 36






<210> 16
<211> 35
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer FosB1R
<400> 16

gtctcctgca ggcttcctcc tccttgttcc ttgcg 35

<210> 17
<211> 35
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer FosB2F
<400> 17

aacgcgtcga cggatgggat tgacccccag ccctc 35

<210> 18
<211> 33
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer FOSB2R
<400> 18

ttggcgcgcc ccttgcctcc acctctcaaa tgc 33

<210> 19
<211> 31
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer VF1
<400> 19

acctcactct cctgtctccc ctgattccca a 31

<210> 20
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer VR1A
<400> 20

gctctggcgg tcacccccaa aagca 25

<210> 21
<211> 28
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer VF2
<400> 21

ccctttccaa gacccgtgcc atttgagc 28

<210> 22 

<211> 29

<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Description of Artificial Sequence: primer VR2 

<400> 22 

actttgcccc tgtccctctc tctgttcgc 29 

<210> 23
<211> 28
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer KF1
<400> 23

gctgcgtcca gatttgctct cagatgcg 28

<210> 24
<211> 30
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer KR1
<400> 24

ttctcaggca cagactcctt ctccgtccct 30

<210> 25
<211> 32
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer KF2
<400> 25

cagatggacg agaaaacagt agaggcgttg gc 32

<210> 26
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer KR2
<400> 26

gaggactcag ggcagaaaga gagcg 25

<210> 27
<211> 29
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer TF3
<400> 27

agcttagcct gcaagggtgg tcctcatcg 29

<210> 28
<211> 31
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer TF2
<400> 28

caaatgcacc ccagagaaca gcttagcctg c 31

<210> 29
<211> 33
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer TR1
<400> 29

gctttcaaca actcacaact ttgcgacttc ccg 33

<210> 30

<211> 33 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer F51 

<400> 30 

cccagtgtct ctgatttagg gagagcacct gag 33 

<210> 31 

<211> 26 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer R51 

<400> 31 

ccagactgcc ttgggaaaag cgcctc 26 

<210> 32 

<211> 35 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer F52 



<400> 32

cagtgagagt cttctctgtc cctcaatcgg ttctg 35

<210> 33
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: primer R52 

<400> 33 

tggatgtgga atgtgtgcga ggccag 26 

<210> 34
<211> 32
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer F31
<400> 34

aatcaaagag gcgaactgtg tgtgagaggt cc 32

<210> 35
<211> 26
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer R31
<400> 35

cggctcccca aaatgtggaa gcaagc 26

<210> 36
<211> 30
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer F32
<400> 36

gaatccatct tgctccaaca ccccaacatc 30

<210> 37
<211> 26
<212> DNA

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequence: primer R32
<400> 37

cgcctcctct ccccagtctc cccttg 26



<210> 38 
<211> 4998 
<212> DNA 
<213> Mus sp. 



<400> 38
tccacccacc
tcacccgggg 
ctgatttagg 
aaattcccca 
agtcttaaac 
gaagctcagc 
ctgagcaatg 
gactcggtag 
atcctgtcca 
caccaagcag 
caggcaaggc 
gaaatagacg 
atagttgttg 
acgtgatgaa 
acgatagcgc 
taaggagcca 
atgccagttc 
aatgcacttg 
tctatcttta 
acctacttac 
cctatttatt 
tttgcaactc 
gtctctgagt 
gaacacattt 
tgtctagagc 
aacccctcag 
caaatagagc 
cagctcttga 
agcccagttt 
cagaagagga 
aacacggctg 
tactcccagt 
ggcagagagg 
ctgagtaccc 
gaacgctgca 
accagggtcc 
cagaagggtg 
acctgctcct 
cctttacctt 
ttgatgcgag 
ttggggtacc 
ggggaaactg 
catagagtca 
gggctttcca 
gggccttagg 
aggatctagg 
tggagaggtg 
gtcatagctc 
cagaacgggg 
tctgaactca 
agtgcaaagc 
cttgcatgac 
acctgcactt 
catgccacca 
agggccttgg 
agcagactga 
ggcccagccc 
cccagcccca 
acgggcagga 
caccctctga 
ggtgccttgc 
tgcctggagt 



tgtttctcac 
agggtgtggt 
gagagcacct 
cttgccctcc 
tcgtagccaa 
cgagtgggct 
gagcgtgggt 
tatttgtctg 
ctggtcccaa 
ataatgtcac 
ctagccaaaa 
ctctgaatgg 
taggttccta 
cgacgggagt 
tttcggcttc 
gtgacgtaga 
cggctgatga 
gcagattctg 
tctatgtatc 
ctatctatgt 
tgtttgtttg 
actatgaagc 
gctgggatta 
gctatgcctt 
tttttcccat 
cgcgcagccc 
caagcctcat 
tcgatggtgt 
ccagagaaag 
gttcgaaaat 
acagagagct 
actgcttccc 
aatcatggaa 
ttgaagaagt 
tcaacattgt 
aggaccccat 
aggacatacc 
ggagtccctg 
gcctcagact 
tgaaggcagc 
caggcctcac 
tgtgagaagc 
cctcggaagg 
ctgcacagtt
atctcctcct 
aaggctgtcg 
ttctcgggtt 
caataatcat 
agcagtgagt 
ccttggggct 
tcgtcacact 
tctatgggag 
ccctaggtac 
gggctagtat 
caagctgggc 
agaagagttc 
cattgccctc 
ggagagctgg 
gggataacac 
gccagtcccg 
ctcatcaggt 
agcctccaac 



gtccccggcc 



gagcccagtg 
ttatccaggg 
cagacttttt 
ctgattccta 

agggaggatt 

aaagaaagaa 
gaaggataaa 
catctacagg 
ccagtctaag 
gctcgcaggt 
gcccactctc 
tcgagctctg 
tacgcttaga 
tgcggccggc 
attggggttc 
gctttgattt 
tatctatata 
atctatctat 
ttttctttga 
cataactggc 
aaagcatgtg 
gcctaagaca 
cctctctcca 
ctccttctgc 
cccccaaaag 
cacagttcca 
aagccagcct 
gttctcccag 
gccttcgcac 
tgaggagcag 
tagaacaggg 
agaccctttc 
ctggtatgcc 
cctcaaagcg 
gctggccaca 
actgctttgt 
taggtctggt 
atcgatgggg 
tgccgtctca 
agatgagcct 
caaagaggga 
cttcctctgg 
gttgctccac 
gctttagagt 
gcacaccggt 
cctctggcat 
gtcaggctgt 
tgcactgctc 
gacacttctt 
ggaatatcag 
ccaccaatcc 



gcggagcttc 
ctagttccct 
ctccaaacac 
gaaacagaaa 
gcttagctta 
ttaatctccc 
gttgagagat 
tcattcccat 



ttcctagtta 
gaagggtgct 
agagtcttct 
gacagggctg 
attgggctgg 
cttctcagag 
cacagagtcc 
tggaaaaagg 
ggctttttct 
ctgtgttcag 
gagtagaaag 
ggcaggtaca 
ctcgctggag 
gctgcgtctg 
cttctgtttt 
catagcagcg 
tctggctcca 
ctccagcaag 
tctatgtatc 
ctatcatcta 
aacaggatct 
ctcttaactc 
ccactacacc 
cacaactcag 
ctgtatccct 
tgtgttaggc 
tcaacagaag 
ggcccctccc 
tgctctccct 
ctgtcccgct 
tcctcctggc 
aacagctggc 
actccaccac 
ccggccactg 
actgaagcct 
ccagtactga 
gaagcagtcc 
cttcacagct 
accttgaaca 
ccctcaatgc 
tctagctcat 
aaggcagatc 
cccattcttg 
aaactcaggg 
tttaggcgct 
gccgtccgtc 
gttggtattg 
agtgaacacg 
ggagggagcc 
catgtagtcg 
gctggccatg 
gtttacagcc 
cctcccacac 
ctcaggggtg 
tggaatctcg 

gggtttctgc

agctgcagca 
accctcgcca 
gctggggagg 
tgagccttac 
tccgtgagct 
aagccgttat 



acttcatggt 
cccacaagcc 
ctgtccctca 
cccaccctat 
gagaaagaga 
gtcgggcagc 
actcgccggg 
gttatgtgag 
cagaagggaa 
cacccaggga 
gggctcccac 
agccagtcca 
aacaaagaga 
tggccacgcc 
ttggcttggg 
tccactttcc 
tctgtaacag 
gttgtctgtc 
tatctatcta 
cctacctact 
tagcacctac 
acaaagatcc 
cagctccagt 
tccccaggcc 
tgaatctctg 
aaagtccaag 
caaagtctag 
ctggaagccc 
ccataccaga 
gaagcaaggc 
tgggttgctg 
atcaggagag 
ctgccccctt 
taacggtggg 
tcggagatgt 
ctaccctgaa 
tatatcctaa 
ccccagcacg 
agtaggtctt 
cccagacatc 
agcagtactg 
cgaccgccac 
agatccgtga 
gtcccttgat 
ggggtgcttg 
cgaggattta 
ttcttgggct 
tccccccgcg 
ccaggcccac 
gcacagcagc 
aaaccctgag 
caatctaggg 
cttggtcagc 
ccatggcagg 
ctgtcctgcc 
cctttatttg 
aagggtcaca 
agaccaaagt 
tggaaagaag 
tttttataaa 
agaacagaca 
cgatttactg 



taaagaagcc 
cccagtgtct 
atcggttctg 
tcaggacagt 
tgaggctcct 
ccagccaata 
ttctaaggtt 
attctgcctg 
agtgaacatc 
ccaagacctg 
ctccagagaa 
tatcataatc 
accagattga 
ctcggcgtga 
cagagtggga 
ctggcacacc 
ggaaggggtt 
tatctattta 
tcatctacct 
tacctatcta 
ctatggctgg 
acttgcctgt 
aggaccttta 
ccagcctccc 
ccccatccga 
gtatgggatc 
ccagagcaaa 
ccactatcac 
ggatctgccc 
aaagtgctca 
ctgaaattcg 
atctgaccaa 
ctcctccacc 
caggaagggc 

ttcggggata 
agacagagat 
actggctgtc 
tccatggcac 
cccctgacag 
ttggataagt 
ccctagaaca 
cagacctgtc 
aggcgtcaaa 
cagtggtgtc 
gctgttcctc 
ggtcaccggg 
cctccacgta 
ttactgcagg 
ccaccagggc 
tctgatagta 
tgcagcggcc 
cacctgccca 



cctctagccc 



ctcatcctct 
ttcccagaac 
cagtagggtc 
catgtgttgt 
gtgggaccat 
aaacgtttcg 
tttgatcagg 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 
ctaggtgctt gtcccatcct accccccgct tcgaatctgg atttttgggg caagaagggg 3780
ggttggggga gagctggcaa gcactttggg ggaggttttc ttttctcctc ataaaagaac 3840 
aaagcttcat ttctggcctc tccttgttct ctctaagctg ggtgttacag cataggaagt 3900 
agtgggtcag agtctattct tctttcttta ttttttttag atttatttat tttatgtttt 3960 
gtgtataagt gtctgctcac atgtgcatct gtgcaccaca tgcatgtctt gtgtctatgg 4020 
aggtcagaag agggctttga ataccctgga actggagttt tgaacagtta tgagctgccg 4080 
tgtggatgct gagaatcaaa cccaggtcct ctgtaagaac aagtactctt aaaggctgag 4140 
ccatctttcc agtcccagag cccattcctg aggctttcac taatccattg atcctcgggg 4200 
gaccaccctg gccacacctt caatgacctc atttatttta aaaaaaaaat ggactcattg 4260 
ggcatacttt ctagactcac atactaagtg ggatttctct ataaagaagt gctcactggg 4320 
gtagagtgcc aggttttggg ccaaattcca agcactggca cacttctgaa gcccctccgt 4380 
tttctgttct gtaatcacag gcgagcgtgc ctttggtgtc tcttctctat ggaccgcagt 4440 
agtctcagcg gcaaaatgaa acactaaatt ccactcccta cagacgcgtg aagcctaagt 4500 
ggaaaccggc attaaagggc tttaagaatc tcaactgcga ttctttaacc atccggaggg 4560 
gacgtggata catgtagcca gcttgcttcc acattttggg gagccgagcg agcggtagga 4620
aatggaagac agctctttac agccctttct acagcatctt gcacaccacc aaggggagac 4680
tggggagagg aggcggagcc aggtgtgggc gtggctggag acctggggta ggcttgcgcc 4740 
tgcgtcgggg gcggagcccg tgaaacctag aggcggggcg tcaaatcctt gactctgctg 4800 
ctcagaggcg tggttgctgt tgagcatctt agctccgctg tgcttagatt ggagcagcgc 4860 
tttgttccgg gcaccggcgt ctctaccctc ccgcgtctgg tccatgcttc tctctccctt 4920 
catgcccttc ctaagtcgct gagtcccgga gctgccctcc tccttctgct tctacacttg 4980 
tagcccagca cctttacc 4998 

<210> 39 
<211> 11176 
<212> DNA 
<213> Mus sp.

<400> 39 

gcagctgggc aaacgttggc gatgccggtg caaagtatat acccggtggt tagcagaagc 60 

tgagaacttt tagccgaaag ccggctccct aagccgaagc taggcaagta ggggaagaaa 120 

aagaaaaaaa aaattccaga gaagcttcca gagcctcctc ctcttccctc ttccttcaaa 180 

aggactgcaa gtccgcagtc accctccacc cagcaagagt tagggcctcg aaccccggtc 240 

acgctgcctc cgcctcctgc cgaacgtaac gggggacccg tgcgtaaagc gtgacgcgct 300 

ggaatcctcc gtctgacgcg gggcacgcac aggccgcagc ccctccgccc gccccgcccc 360 

tgacgtccgg gcacgttcta ttttggaacg ccgaggccac gttgctaagg gagggggcag 420 

cgtggctttg tgattggctg tcgcgcgcag ctttagccaa tcagcgttcc cttcctattt 480 

gtagagcgta gctcccttcc ttgctttttg tggttcttcc cgtgctgggg gtctccaaga 540 

ggagagctag gattcttgtc gcgatcggga ctcgttgtca ccccatggtc tgcgaggact 600 

tgtgtggacc tggtctgttg tcataagcta gaggcttttg gctgagtgtt agcgcctcta 660 

agggggaact gaaggcctca tccttctcag gcacacatat acgtgctcct gagctctaga 720 

cactcagtcc ttccgaggtg ttcaaacact agatgagcta gcctacggag aggcagccag 780 

gtggtctcta aaaggtctgc cttcccttag ttcccaggct ctgattggcc agggattcag 840 

cccttccctc gccacgcccc ctagagtagt taagcctcta ggattccact tgcgggaagg 900 

gggggggggg gggcgtgatg gacgcttctt ggggacgcag atcctatgtc accccatccc 960 

ctgcaagaca gtctgagaga ttctcgctgt cacttttctc tgcctatcag ttcactgaaa 1020 

cctgtcagtc tcactgggaa gagacagaca ctcggaaggg atgctctcaa ctcttaggcc 1080 

ggtcccccaa caccgttgga actgggatct ccgcctgcgg gagccctcat gcagtggggg 1140 

gtgtgtttgt gtgtgagtgg agaggaaggc ttggctaagg cctctccctc tccctccctc 1200 

tgtggtgggg gttggggggt tttggctgta tgtgtgtgtg aatgtctgtg gctccatccc 1260 

gggagtttgc caccaggttc tgtccagcct cctctcccac ccaccccccc acacctaaga 1320 

gtcaccaacc cggggtgtga ttcaccaccc gctggaaccg tgcaaccttt ccccgaggaa 1380 

gaaggaggag gtagaaggca gttgaacaga atctctcatt aaccactgcg tcacggtgta 1440 

gtggaagggt gggtgttgtg gctttttgcc tgtgacacac acatccacac ccgctcaccc 1500 

tgtgctcact cacagggtcg gtctctctta tctctcttgg gcgtgtgtgt gtcggtggct 1560 

ttgtttgtgt gtctacgcct gtgtgtgtat gtctcacccc gtaggagtgc gccggtctcg 1620 

gggaaatgcc cggctccttc gtgccaacgg tcaccgcaat cacaaccagc caggatcttc 1680 

agtggctcgt gcaacccacc ctcatctctt ccatggccca gtcccagggg cagccactgg 1740 

cctcccagcc tccagctgtt gacccttatg acatgccagg aaccagctac tcaaccccag 1800 

gcctgagtgc ctacagcact ggcggggcaa gcggaagtgg tgggccttca accagcacaa 1860 

ccaccagtgg acctgtgtct gcccgtccag ccagagccag gcctagaaga ccccgagaag 1920 

agacagtaag tatgaggcct caggagttgg gatggaggag cctagctagg gatgtgggct 1980 
cagtttgtac agtgccttgc tgccatgcat gaagatccct agcacagcat aagccaggag 2040
tggttatgca gacctgtaac cccagctctc agaaggtgga ggcaggagga gcaggagttc 2100 
gaggccagcc tgtgctactt atggagtcca gcctgcactg caagagatca ttattttcaa 2160 
aagttggcct tggggggagg tgggtgaggg aagtaagaga aagtgacagt aattttgtca 2220 
cttaatagtt ggaggttcct ctgaggcctc aagtctgaag gaactttacc attctggcca 2280 
gtgaggagta ggggttatta tttggggttc aggaggaagg aagttttctt agggctgata 2340 
gaggtacccc cagatctcat ggtccttatc tctgactcag cttaccccag aagaagaaga 2400 
aaagcgaagg gttcgcagag agcggaacaa gctggctgca gctaagtgca ggaaccgtcg 2460 
gagggagctg acagatcgac ttcaggcggt aaggaggagt ctgggggtgt cttgaggccg 2520 
tgctgggagc actctgcctt gttcttcccc cgtttctcac tgtgcctgtg tcctaaacga 2580
ggaaaccccc tcttagggaa caggggtcag tataggctga tggagtggct ccatatgcat 2640 
gctcagaccc atgcccactt actttcgact gttccccact ttccctgaat atgtccccac 2700 
atgtcaccct cctggctttc tctcagccta aggagacaag ctagaggagg taattctctc 2760 
accttctttt cttcactaaa taataatcca ttttgccttc ctgcctccat ttttttttcc 2820 
tgagctgggg atctacctgt cgtagttcag ccctcctccc ccaacttgat agcctcaagt 2880 
ttcagccctt ggctgagatg ccatcatcct gactggctct ggctggaaac tattttgtgc 2940 
taagtcaatt ccttgtctgc tacttcagct atctacagty ctgccgaact tgagctggtg 3000 
gcgcccacca agcccacttc tttctctctt ttttacctca gtgcaacccc ccacacacaa 3060 
aacttcatgc ctgccccttg aaaccagggt gcgtctctga ctccccgtcg ggaggctgaa 3120 
ggagatgggt aacagaacct cattaaaaac aacacataag cattacctac tgactcaaca 3180 
aactgtagtg tttttctttt ttcctctcaa aaaattattt cgtttgttta tttattattt 3240 
gcttatgttt gagtgagtgc tggtgcacca cagcacacat acgaggtcag agggaaattt 3300 
tcatagtttg ttctctcctt ccgtgttgtg ggtgcttgct ggcaatctcc ttcactcagt 3360 
gagctacaat gcccccttct gccctttaag gcagagtact ccttagtaca gggggaccct 3420 
ttcctcggcc tctcaaagtt gagattacaa atgttcacca tcacaccagg cttggagttc 3480 
ttgcctatca gtgacgtcca ctcctgccta gcttcttccc aaccatcttt tagtctgatg 3540 
gggaaaccga ggcacgagta gcatggtcta ccaggatttc ctcttagggg acggtcccct 3600 
cagttgggag ggagctgtcc agccccctgg atcagcagca agaatgtatg agtgtggggt 3660 
tgggcgggtg aagctactct gtgtggtcgc tgaccagcaa ttctcctttc tctgtctcct 3720 
atgacctggc cctgctggga tccattagga aactgatcag cttgaagagg aaaaggcaga 3780 
gctggagtcg gagatcgccg agctgcaaaa agagaaggaa cgcctggagt ttgtcctggt 3840 
ggcccacaaa ccgggctgca agatccccta cgaagagggg ccggggccag gcccgctggc 3900 
cgaggtgaga gatttgccag ggtcaacatc cgctaaggaa gacggcttcg gctggctgct 3960 
gccgccccct ccaccaccgc ccctgccctt ccagagcagc cgagacgcac cccccaacct 4020 
gacggcttct ctctttacac acagtgaagt tcaagtcctc ggcgacccct tccccgttgt 4080 
tagcccttcg tacacttcct cgtttgtcct cacctgcccg gaggtctccg cgttcgccgg 4140 
cgcccaacgc accagcggca gcgagcagcc gtccgacccg ctgaactcgc cctcccttct 4200 
tgctctgtaa actctttaga caaacaaaac aaacaaaccc gcaaggaaca aggaggagga 4260 
agatgaggag gagaggggag gaagcagtcc gggggtgtgt gtgtggaccc tttgactctt 4320 
ctgtctgacc acctgccgcc tctgccatcg gacatgacgg aaggacctcc tttgtgtttt 4380 
gtgctctgtc tctggttttc tgtgccccgg cgagaccgga gagctggtga ctttggggac 4440 
agggggtggg gcggggatga acacccctcc tgcatatctt tgtcctgtta cttcaaccca 4500 
acttctgggg atagatggct gactgggtgg gtagggtggg gtgcaacgcc cacctttggc 4560 
gtcttacgtg aggctggagg ggaaagagtg ctgagtgtgg ggtgcagggt gggttgaggt 4620 
cgagctggca tgcacctcca gagagaccca acgaggaaat gacagcaccg tcctgtcctt 4680 
cttttccccc acccacccat ccaccctcaa gggtgcaggg tgaccaagat agctctgttt 4740
tgctccctcg ggccttagct gattaactta acatttccaa gaggttacaa cctcctcctg 4800 
gacgaattga gcccccgact gagggaagtc gatgccccct ttgggagtct gctaacccca 4860 
cttcccgctg attccaaaat gtgaacccct atctgactgc tcagtctttc cctcctggga 4920 
aaactggctc aggttggatt tttttcctcg tctgctacag agccccctcc caactcaggc 4980 
ccgctcccac ccctgtgcag tattatgcta tgtccctctc accctcaccc ccaccccagg 5040 
cgcccttggc cgtcctcgtt gggccttact ggttttgggc agcagggggc gctgcgacgc 5100 
ccatcttgct ggagcgcttt atactgtgaa tgagtggtcg gattgctggg cgcgccggat 5160 
gggattgacc cccagccctc caaaactttt cctgggcctc cccttcttcc acttgcttcc 5220 
tccctcccct tgacagggag ttagactcga aaggatgacc acgacgcatc ccggtggcct 5280 
tcttgctcag gccccagact ttttctcttt aagtccttcg ccttccccag cctaggacgc 5340 
caacttctcc ccaccctggg agccccgcat cctctcacag aggtcgaggc aattttcaga 5400 
gaagttttca gggctgaggc tttggctccc ctatcctcga tatttgaatc cccaaatagt 5460 
ttttggacta gcatacttaa gagggggctg agttcccact atcccactcc atccaattcc 5520 
ttcagtccca aagacgagtt ctgtcccttc cctccagctt tcacctcgtg agaatcccac 5580 
gagtcagatt tctattttct aatattgggg agatgggccc taccgcccgt cccccgtgct 5640 
gcatggaaca ttccataccc tgtcctgggc cctaggttcc aaacctaatc ccaaacccca 5700 
cccccagcta tttatccctt tcctggttcc caaaaagcac ttatatctat tatgtataaa 5760

taaatatatt atatatgagt gtgcgtgtgt gtgcgtgtgc gtgcgtgcgt gcgtgcgtgc 5820 

gagcttcctt gttttcaagt gtgctgtgga gtccaaaatc gcttctgggg atttgagtca 5880

gactttctgg ctgtcccttt ttgtcacttt tttgttgttg tctcggctcc tctggctgtt 5940 

ggagacagtc ccggcctctc cctttatcct ttctcaagtc tgtctcgctc agaccacttc 6000 

caacatgtct ccactctcaa tgactctgat ctccggtctg tctgttaatt ctggatttgt 6060 

cggggacatg caattttact tctgtaagta agtgtgactg ggtggtagat tttttacaat 6120 

ctatatcgtt gagaattctg ggtggaaatg tctgatcagg agaagggcct gccactgccg 6180 

accacaattc attgactcca tagccctcac ccaggctgta tttgtgattt ttttcatttt 6240 

gtttttttgt attttgcacc tgaccccggg ggtgctgggg cagtctatca ctgggcagct 6300 

cccctccccc ccttggttct gcactgtcgc caataaaaag cttttaaaaa actgtatcct 6360 

tcaggtcaaa gtgtctgttt tccctggaca tctactacat ggcttccttt cagaaaaacg 6420 

gagtttggat tgctagggaa gtcttgctgg cacttagtgg gacgcctaac gaatcagaac 6480 

ctacaacggg actaaaagga agtggagact tgctaggttt tcccatgttc ccaggctggg 6540 

ccacctactt gaaaaaataa ggggcggaaa agtgtaaggt accaaatttg gtgaagggtc 6600 

tgggagaatt tcatgatcgg aaaagaattt attcaccttg ggtgtgcaat gaactttcag 6660 

caacagttaa gggcaagggt gtaaaagctg ggcacaactt gtaaatccta gcatttgaga 6720 

ggtggaggca aggggatcaa ctggtggagt tcagtgtcat gtggatcgta gataccaagc 6780 

gcaaagatct gctatgggga gagggcttgg tacaccaggg gagccagaag tttcgtggtg 6840 

agggtagtgg agggcaagtg gagagtgaga gttagcctca gggagattct acaggcaatg 6900 

atgcagagtt cagacgctcc ctttgaaagc actagagagc cgcagcaggt tttgagcaga 6960 

gaaggttaga gttaggtggt ctcttctagc ccatcccagg ctgaggagga cgctgagggt 7020 

ttcaagaagg atcgagaatg gaaagcagag gagaagaagg atccaagagg catggaggag 7080 

gcagaacaca tttctcttct ttaatagcaa gcctggaaag gataacttgc tgcaggagga 7140 

gatgctcacc agtcgggtgg tctagggggt tcttggaaaa gagaaggcat ttgctcaagc 7200 

ctcggttccc ccattctcgc tcttctgtca gcttgtcttc cattaagtgt gtgtctcaag 7260 

gccaccctgc tcaggactcc ttgtgagacg accttctatg ctcgagttca ttaaaaacac 7320 

aattgcctgg tgccgtgctc tctccactgg ctcagttacc tcaaaagacc agggctaaag 7380 

gtgtgatcac aactctatcc ccattactgc tccaacgcag agacaggact gagccggagt 7440 

gaacaaatga acaaaaatga ctaataatgc atgcgtgatt aaatacataa aagagcagat 7500 

gactggatga gcaaatcgtt taaggagaga cagcaagatc ctagaatttt ggagactaat 7560 

ttaaatccat ctttgagatg catttggtcg gaaattcctg ggaggaaaaa aagtgtaaat 7620 

atgaagagag aataaatgag aataggggtg gcttcagaga ggttaactgc gcgctggtcg 7680 

cttttgtaca agaatgtgaa ttgcagggag caaaatggga tagatactcc cgcccgaaag 7740 

gtggaattga accactctgt cgctaaacag ctacaggttt gaagcctgca ccccagacca 7800 

ctgaggatca tccgggcgaa aggagctatt ttcagttagt tatataaagg cgagatacta 7860 

ctacttttta cacttatggt cattatttgt ggtatacagt agataattaa tttcaatggt 7920 

ttcgaacatt ttttttcact ttttcttgtg aacatgtgtt tcctcagtaa agtgttccgt 7980 

gaatgactct actaactaaa aagtaagtag cttcatttgc atagcgcctt gcattttggg 8040 

aagcagcgcc taaagtgcct gtctccctaa ctaaaagcag aattttttgc aaagtgaaaa 8100 

gtcagtttta tttttgtttg tttgtttgct tgtttgtttt taatggaaaa acttctcacg 8160 

cggcccattc gtagcagaat tcgagatttt ctgcaagcga gaagcaagac tttcgtaggg 8220 

tctgacggca cgcggccgca gagcgacacc tgccgttgct ttatagaact gcaagtatgt 8280

agggaatcta ctgagtccct aggtgatgga gttgacaacc aactcccctt gagtttagac 8340 

gctaaaaacc atcccttttt atatttatgt gattagccca gggaaactaa ggctcagaca 8400 

tggataatac cacagccgag ttcttgtagc ccaactccct aggggaaatg aaacctacag 8460 

ttgtggtttt aatatgcttg gcccaggggc agtggcccta ttggcaggag tggccttatt 8520 

agcggaggtg taccttgtta gagaagtgtg tcacttggag gcgaggtttt gaggtacgta 8580 

tgctcaagtc tggccagtgt gatcctggct gtctgcagaa cgtggtctcc ttctggctgc 8640 

cttcggatca aggtgtagaa ctctcagctc cttctccagc accatgtctg cctgcttaac 8700 

gctttgcttc tttccatgac gataatgaac tgtgcctctg aaactgtaag tcagcccccc 8760 

agttacatgt tttcttttat aagagttgca tatatatatg tatgtatata tgtatgtata 8820 

tatgtatgta tatatatata tatatataaa cagggtctca ctctttagct ctggctggcc 8880 

tgaaattcac tatgtagccc aggattgcct gaactttgaa gcaatcttcc tgcctcagcc 8940 

tcccaatggt attacaggca tgagtcacaa caagccattt aaatcttatg atgacttata 9000 

agaagacaga aaatcagagt tcctttacct agttcacaga tccctacaat ctaacctcgt 9060 

tcgctccata aacagcccta ccccaccctc ctggaactgc tttgaggaat gctgcaggct 9120 

ctcacaggca cactcctcct tggttaatct cttcagcctg gttgccttcc ccccccatgt 9180 

ccatgtggcc caaagcctct catcctgttc tcaaatacca ctagctagta aggctccccg 9240 

acctgacccg gtttaaatat tagaaaaggg tcactttctc cctgccacag acaaccaaac 9300 

caccatatgc ttgtcactta ctacctgact atgaaggtta atagatgtct tcacaacctt 9360 

tctctgagcc tcagtttccc cacctgcata atgcatctga gacacagaat tccctagagc 9420 
tgtggttctc

ccccaccacc 
ttatgaatag 
gtcattccac 
cagggaccat 
ctaatgacag 
gtgtgtgtgt 
cattactcac 
gtggcggtgg 
tacaaatatt 
gaagagaggc 
acgacagcca
aactgtaatt
ttagatgacc 
ttcctgccat 
ccttctggga 
gtgtgtgtgt 
ggctttagtg 
gtgaacacta 
aagtagaact 
taacttaata 
gtacagggga 
gagggtatag 
aaatttgaaa 
aaaaaaaccc
ttttctattg
ccccaccccc